CS6240: Parallel Data Processing in MapReduce: "Pig; MapReduce design patterns. Read the Pig paper. Read chapters 8 and 11 in the Tom White book."
How to process a million songs in 20 minutes « Music Machinery: "The recently released Million Song Dataset (MSD), a collaborative project between The Echo Nest and Columbia’s LabROSA is a fantastic resource for music researchers. It contains detailed acoustic and contextual data for a million songs. However, getting started with the dataset can be a bit daunting. First of all, the dataset is huge (around 300 gb) which is more than most people want to download. Second, it is such a big dataset that processing it in a traditional fashion, one track at a time, is going to take a long time. Even if you can process a track in 100 milliseconds, it is still going to take over a day to process all of the tracks in the dataset. Luckily there are some techniques such as Map/Reduce that make processing big data scalable over multiple CPUs. In this post I shall describe how we can use Amazon’s Elastic Map Reduce to easily process the million song dataset."
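To make the "over a day" figure in the quote concrete, here is a rough Python sketch of the arithmetic, plus a toy map/reduce over a few made-up track records. The track count and the 100 ms cost come from the quote; the `artist` field, the function names, and the perfectly even split across workers are my own assumptions for illustration, not anything taken from the Music Machinery post or from Elastic Map Reduce.

```python
from collections import Counter
from functools import reduce

TRACKS = 1_000_000        # dataset size, from the quoted post
SECONDS_PER_TRACK = 0.1   # 100 ms per track, from the quoted post


def sequential_hours(tracks=TRACKS, seconds_per_track=SECONDS_PER_TRACK):
    """Hours needed to process every track one at a time."""
    return tracks * seconds_per_track / 3600


def ideal_parallel_hours(workers, tracks=TRACKS,
                         seconds_per_track=SECONDS_PER_TRACK):
    """Hours needed if the work splits evenly over `workers`.

    Assumption: no scheduling, shuffle, or I/O overhead at all,
    so this is a lower bound rather than a prediction.
    """
    return sequential_hours(tracks, seconds_per_track) / workers


def map_track(track):
    """Map step: emit a count of 1 keyed by the (hypothetical) artist field."""
    return Counter({track["artist"]: 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def count_by_artist(tracks):
    """Run the map step over every track, then fold the partial results."""
    return reduce(reduce_counts, map(map_track, tracks), Counter())


if __name__ == "__main__":
    print(f"sequential: {sequential_hours():.1f} hours")
    print(f"100 ideal workers: {ideal_parallel_hours(100) * 60:.1f} minutes")
    sample = [{"artist": "A"}, {"artist": "B"}, {"artist": "A"}]
    print(count_by_artist(sample))
```

Because each track is mapped independently and the reduce step only merges partial counts, the map calls could run on separate machines, which is the property that lets Map/Reduce spread the work over many CPUs.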
HiToRiGoTo: Font problems using Texmaker on Ubuntu 10.04: "Font problems using Texmaker on Ubuntu 10.04"
24 ways: web design and development articles and tutorials for advent: "About 24 ways"
fourteen years of pantone colors-of-the-year (tecznotes): "2000: Cerulean Blue."
Cerulean Blue - Pantone Wiki: "Pantone's Cerulean Blue 15-4020 was the 2000 Color of the Year."
Ubuntu -- Package Contents Search Results -- pdflatex: "texlive-latex-base"