Visualizing Weezer's Energy Distribution

A nice bit of playing around with statistics and music: ArtistX visualizes the Echo Nest database and the attributes assigned to artists there. So you can, for example, look at how Weezer's heavy tracks are distributed. Sweet!
Distribution: Type the name of the artist, select an attribute and see the distribution of songs with that attribute. Click on a bar to start playing songs with that range of values for the attribute. Scatter: Type the name of the artist, select attributes for the X, Y axis along with the color and size of the displayed points. Click on a song to hear it.
Attributes
Danceability – how danceable a song is. 0 is least danceable, 100 is most danceable.
Duration – the length of the song in seconds.
Energy – the overall energy of the song, 0 is least, 100 is most.
Hotttnesss – the popularity of the song, 0 is least, 100 is most.
Key – the key of the song. 0 is C, 1 is C# and so on.
Liveness – the likelihood that a song was performed in front of an audience. Above 80 is usually live.
Loudness – the overall loudness of the song in decibels.
Mode – the mode of the song where major is 0 and minor is 1.
Speechiness – how much spoken word is in the song. 0 is least, 100 is most.
Tempo – the most frequently occurring tempo in the song, in beats-per-minute.
Time signature – the number of beats per measure in the song.
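The distribution view described above boils down to a histogram over one attribute. Here is a minimal sketch in Python, assuming songs are plain dicts carrying a 0 to 100 attribute. The function name, the bin width of 10 and the sample songs are all made up for illustration and say nothing about how ArtistX or the Echo Nest API actually work.

```python
from collections import Counter


def attribute_distribution(songs, attribute, bin_width=10, max_value=100):
    """Count songs per value range for a 0-100 attribute such as energy."""
    counts = Counter()
    for song in songs:
        value = song[attribute]
        # Clamp the top edge so a value of exactly 100 lands in the last bin.
        bin_start = min(int(value // bin_width) * bin_width, max_value - bin_width)
        counts[bin_start] += 1
    # Sort by bin start so the result reads like the bars of a chart.
    return dict(sorted(counts.items()))


# Invented sample data, not taken from any real database.
songs = [
    {"title": "Song A", "energy": 92},
    {"title": "Song B", "energy": 88},
    {"title": "Song C", "energy": 45},
    {"title": "Song D", "energy": 100},
]
distribution = attribute_distribution(songs, "energy")
```

Each key is the lower edge of a bar, each value the number of songs in it. Clicking a bar would then amount to filtering the songs whose value falls into that range.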
Interactive Data-Visualization of Lord Of The Rings

The Lord Of The Rings-Project has been analyzing things like word counts and geography in The Lord of the Rings for quite a while and has now put a nice interactive data visualization online: An interactive Analysis of Tolkiens Works – The Silmarillion, The Hobbit and The Lord Of The Rings, which offers, among other things, “keyword frequency search, character mentions, sentiment analysis and network diagrams”. Sweet!
The analysis is based on the Silmarillion, the Hobbit and the Lord of the Rings trilogy. The fact that pages with illustrations have not been included means the page numbers may be slightly off. Since the index in the Silmarillion and the Appendices to the Lord of the Rings are not part of the narrative they have not been included.
An interactive Analysis of Tolkiens Works – The Silmarillion, The Hobbit and The Lord Of The Rings (via /.)
Deep Inside: Data-Analysis of 10,000 Porn Stars

A pretty thorough study by Jon Millard, who pulled 10,000 records from the Internet Adult Film Database and analyzed them.
For the first time, a massive data set of 10,000 porn stars has been extracted from the world’s largest database of adult films and performers. I’ve spent the last six months analyzing it to discover the truth about what the average performer looks like, what they do on film, and how their role has evolved over the last forty years.
Deep Inside – A Study of 10.000 Porn Stars and their Careers (via Waxy)
There's also a pretty big infographic to go with it after the jump: Give me the rest, baby…
Walking Dead Zombiekill-Statistics in a giant Infographic

Andrew Barr and Richard Johnson produced a detailed infographic for the National Post covering every zombie kill in the Walking Dead series: Graphic: Stopping the Dead – a statistical look back at the Walking Dead series so far (via io9). The complete thing after the jump (JPG, 9.4 MB).
Andrew Barr and Richard Johnson look at a few of the key statistics of two-and-a-half seasons' worth of undead mayhem. They find noteworthy the gradual increase in the body count, the increasingly creative means of Zombie dispatch, and the fact that every character seems to have developed a clear enjoyment for putting the ambulatory cadavers down for good.
Fuck that Crap, dammit: The WTF-Level of Twitter

Great fun with stats: WTFLevel analyzes the Twitter timeline for its share of fucks, shits and WTFs and calculates the WTFLevel from that, something like a Defcon system for fuckshit-arghs, including an API and scripts you can embed on your own site. Right now we're hovering somewhere between mild cursing and hate tirades, so pretty peaceful by internet standards.
WTFLevel.com is a project to track and monitor the amount of swearing on Twitter at any given moment. It’s mostly a humorous attempt to get an idea of how aggravated the planet is at any moment. We continuously check Twitter for references to a list of swear words. The list is private to prevent anyone from trying to manipulate the system, but as an example, the Seven Dirty Words are all on the list. At the moment, the list is made up of only swears in English. Every 10 seconds we total up the count of all tweets containing a swear, and figure out the rate of sweary tweets from that. The stats on the website are updated in real-time with any new data. […]
Using the Twitter Streaming API, I scan tweets for a collection of swear words and other curse-like expressions. I calculate two values from that data: the rate of tweets which contain swears to those that do not contain swears, and also the magnitude of sweariness in those tweets. For example, a tweet with more swears in it has a higher magnitude than one which only has one swear in it.
WTFlevel.com: Real-Time Tracking of Swearing on Twitter (via MeFi)
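The two values described in the quote can be sketched in a few lines of Python. Everything here is an assumption for illustration: the real word list is private, so `SWEARS` is a harmless placeholder, and reading "magnitude" as the average number of swears per sweary tweet is only one plausible interpretation, not WTFLevel's documented formula. The sketch also works on a plain list of strings instead of the Twitter Streaming API.

```python
import re

# Placeholder word list; the site keeps its real list private.
SWEARS = {"wtf", "damn", "crap"}


def swear_count(tweet):
    """Number of words in the tweet that appear on the swear list."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return sum(1 for word in words if word in SWEARS)


def wtf_stats(tweets):
    """Return (ratio of sweary to clean tweets, average swears per sweary tweet)."""
    counts = [swear_count(tweet) for tweet in tweets]
    sweary = [count for count in counts if count > 0]
    clean = len(counts) - len(sweary)
    # With no clean tweets at all the ratio is unbounded.
    ratio = len(sweary) / clean if clean else float("inf")
    magnitude = sum(sweary) / len(sweary) if sweary else 0.0
    return ratio, magnitude


# Invented batch standing in for one 10-second window of tweets.
batch = ["wtf is this crap", "nice weather today", "damn", "hello world"]
ratio, magnitude = wtf_stats(batch)
```

In this batch two of four tweets contain a swear, so the ratio of sweary to clean tweets is 1.0, and the sweary tweets carry three swears between them, which gives a magnitude of 1.5.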
Mathematics of a Serial Killer
Statisticians yesterday presented a mathematical model for the murders of Andrei Chikatilo, who killed more than 50 people in the 80s. They factored in the behavior of neurons along with various factors such as logistics and planning.
What the authors used as the basis of their analysis was the hypothesis that “similar to epileptic seizures, the psychotic affects, causing a serial killer to commit murder, arise from simultaneous firing of large number of neurons in the brain.” Accordingly, they based their model on neuronal firing – the fact that, once a neuron fires, there’s a refractory period that has to pass before it can fire again. When it does fire, it can trigger other neurons to fire if they’re ready to. As you can imagine, though, those firings aren’t always in sync. So what the authors suggest is that there must be a threshold – that is, when a certain number of neurons fire, the serial killer becomes driven by an overwhelming urge to kill.
In modeling the mathematics of this, the authors note that, “We cannot expect that the killer commits murder right at the moment when neural excitation reaches a certain threshold. He needs time to plan and prepare his crime.” So they built that delay into their model as well. Moreover, the authors also note that the murders do appear in clumps, with the killer more likely to kill after another murder. However, the killings eventually have a sedative effect, pushing the neuronal activity below the “killing threshold” – which is why there are large intervals of time between groups of murders.
When the authors completed their mathematical model, it was remarkably close to the real data.
Scientists Uncover The Mathematics Of Serial Killers (via /.), more: here the PDF “Stochastic modeling of a serial killer”, Technology Review: Mathematicians Reveal Serial Killer’s Pattern of Murder, The Criminal Lawyer: Statistics and the Serial Killer
The Atlas of Economic Complexity

My good friend Kate at MIT recently tweeted Cesar A. Hidalgo's “Atlas of Economic Complexity”. If you're even a bit interested in economics, infoporn and data visualization, the thing is a treasure trove, and I've been reading it for three days now (though admittedly I'm still busy deciphering all the infographics). The website for the eBook has all the data and infographics in an interactive version to play around with. The thing is something like a long, more detailed version of Hans Rosling's documentary “The Joy of Stats”. Very nice!
Statistical Distribution Plushies

If I knew a bit more about statistics, these distribution plushies would mean more to me, but I only know the Gaussian normal distribution from physics and that one isn't even included. But whatever, great idea.
Light Green Standard Normal Distribution
Baby Blue t Distribution
Light Yellow Chi-Square Distribution
Light Pink Log Normal Distribution
Lilac Continuous Uniform Distribution
Tan Weibull Distribution
Olive Green Cauchy Distribution
Slate Blue Poisson Distribution
Maroon Gumbel Distribution
Gray Erlang Distribution
Documentary: The Joy of Stats
(YouTube direct stats, via Waxy, Gapminder)
Great documentary with Hans Rosling about statistics and data visualization. A clip from it (“200 Countries, 200 Years, 4 Minutes”) made the rounds over the past few weeks; here now is the complete, super interesting documentary, which you can buy on DVD here in the Wingspan Productions shop, and for anyone who wants more of this stuff: there is a whole series of TED Talks by Hans Rosling.
Snippet from the BBC:
Documentary which takes viewers on a rollercoaster ride through the wonderful world of statistics to explore the remarkable power they have to change our understanding of the world, presented by superstar boffin Professor Hans Rosling, whose eye-opening, mind-expanding and funny online lectures have made him an international internet legend.
Rosling is a man who revels in the glorious nerdiness of statistics, and here he entertainingly explores their history, how they work mathematically and how they can be used in today’s computer age to see the world as it really is, not just as we imagine it to be. Rosling’s lectures use huge quantities of public data to reveal the story of the world’s past, present and future development. Now he tells the story of the world in 200 countries over 200 years using 120,000 numbers – in just four minutes.
The film also explores cutting-edge examples of statistics in action today. In San Francisco, a new app mashes up police department data with the city’s street map to show what crime is being reported street by street, house by house, in near real-time. Every citizen can use it and the hidden patterns of their city are starkly revealed. Meanwhile, at Google HQ the machine translation project tries to translate between 57 languages, using lots of statistics and no linguists.
NC is ahead of the Game (when it comes to Chewbacca)

Michael Heilemann's Chewbacca story, which I linked to recently, went pretty much through the roof. So he analyzed the traffic's referrers, and the graphic shows quite clearly what you've got in this little blog here. And I'll say no more about it, the picture speaks for itself. Ha!
Chewie dropped it like it was hot the Saturday before last and traffic spiked at one hundred times normal late last week, for a total of over 50,000 unique visits. Aside from worrying amounts of stat masturbation, I also took the opportunity to glean some insights into the referrers and visitors interested in everyone’s favorite furry.
Favicon-Map of the Web
Nice piece of diligent work: Nmap.com grabbed the favicons of the top million most-visited sites worldwide and plotted the icons on a map, roughly proportional to traffic. Nerdcore should actually be on there too (worldwide Alexa ranking of 15,553), but I can't find it, I don't have the patience to search and I don't really care all that much anyway. Meh. Still a nice thing.
A large-scale scan of the top million web sites (per Alexa traffic data) was performed in early 2010 using the Nmap Security Scanner and its scripting engine.
We retrieved each site’s icon by first parsing the HTML for a link tag and then falling back to /favicon.ico if that failed. 328,427 unique icons were collected, of which 288,945 were proper images. […] The area of each icon is proportional to the sum of the reach of all sites using that icon. […] The smallest icons–those corresponding to sites with approximately 0.0001% reach–are scaled to 16×16 pixels. The largest icon (Google) is 11,936 x 11,936 pixels, and the whole diagram is 37,440 x 37,440.
Icons of the Web (via MeFi)
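The scaling rule in the quote (icon area proportional to reach, smallest icons at roughly 0.0001% reach drawn at 16×16) implies that the edge length grows with the square root of reach. A small Python sketch of that arithmetic follows; the function names are made up here, and since the quote only says "approximately 0.0001%", anything derived from it is a rough estimate rather than Nmap's published figure.

```python
import math

MIN_REACH_PERCENT = 0.0001  # reach of the smallest icons, per the quote
MIN_SIDE_PX = 16            # their edge length in pixels, per the quote


def icon_side(reach_percent):
    """Edge length in pixels of a square icon whose area scales with reach."""
    return MIN_SIDE_PX * math.sqrt(reach_percent / MIN_REACH_PERCENT)


def implied_reach(side_px):
    """Invert the rule: reach in percent implied by a given edge length."""
    return MIN_REACH_PERCENT * (side_px / MIN_SIDE_PX) ** 2


# Google's icon is quoted at 11,936 pixels per side.
google_reach = implied_reach(11936)
```

Run backwards on the 11,936-pixel Google icon, the rule implies a reach of roughly 56 percent, which at least sounds plausible for early 2010.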
Google Pacman Stats
The fact that I didn't blog anything about Google's Pacman doodle (now with its own URL and as a download) over the last few days was of course not down to my weekend laziness mode (in the “long weekend” extended edition), heaven forbid. Nope, I was simply saving the economy, because:
* Google Pac-Man consumed 4,819,352 hours of time (beyond the 33.6m daily man hours of attention that Google Search gets in a given day)
* $120,483,800 is the dollar tally, if the average Google user has a COST of $25/hr (note that cost is 1.3 – 2.0 X pay rate).
* For that same cost, you could hire all 19,835 Google employees, from Larry and Sergey down to their janitors, and get 6 weeks of their time. Imagine what you could build with that army of man power.
* $298,803,988 is the dollar tally if all of the Pac-Man players had an approximate cost of the average Google employee.
The Tragic Cost of Google Pac-Man – 4.82 million hours (via /.)
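The dollar figures in the list are plain multiplication and can be re-checked in a few lines of Python. All numbers come straight from the quote; only the variable names are mine, and the hourly rate behind the second dollar figure is not stated in the quote, so it is derived here by division.

```python
HOURS_PLAYED = 4_819_352          # hours consumed by Google Pac-Man
USER_COST_PER_HOUR = 25           # assumed cost of the average Google user in $/hr
EMPLOYEE_TALLY = 298_803_988      # tally at the cost of the average Google employee
GOOGLE_EMPLOYEES = 19_835

# First dollar tally: hours played times hourly cost.
user_tally = HOURS_PLAYED * USER_COST_PER_HOUR

# Hourly employee cost implied by the second tally.
implied_employee_rate = EMPLOYEE_TALLY / HOURS_PLAYED

# What the first tally would pay per Google employee.
budget_per_employee = user_tally / GOOGLE_EMPLOYEES
```

The first tally comes out at exactly $120,483,800, the second one implies an employee cost of about $62 per hour, and spreading the first tally across all employees leaves a bit over $6,000 per head.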
LastFM Open Mind Index
My Open Mind Index according to LastFM is 110, I can live with that. For Lars, with his six different metal genres in total in the listing, I think it's a bit lower, hrhrhr…
Download Statistics for NIN's Free Album in Google Earth

NIN's latest album “The Slip” was available as a free download (previously on Nerdcore), and now they have published the download numbers broken down geographically… as a Google Earth file. By the way, the record for the most albums downloaded is held by the Fiji Islands.
Our computer wizards presented some data to us in an interesting way and we thought we’d share it with you. CLICK HERE to download a Google Earth KML file representing downloads of our latest record The Slip according to geographic region. We’ve had just over 1,400,000 people download The Slip from our site since its release May 5th (that number represents individual people, and excludes multiple downloads from the same order), and this map displays ONLY downloads that came directly from us.
Star Trek “Red Shirt” Statistics
What? You don’t know about the Red Shirt Phenomenon? Well, as any die-hard Trekkie knows, if you are wearing a red shirt and beam to the planet with Captain Kirk, you’re gonna die. That’s the common thinking, but I decided to put this to the test. After all, I hadn’t seen any definitive proof; it’s just what people said. (Remind you of your current web analytics strategy?) So, let’s set our phasers on ‘stun’ and see what we find…
The basic stats:
The Enterprise has a crew of 430 (startrek.com) in its five-year mission. (Now, I know that the show was only on the air for 3 years, but bear with me. 80 episodes were produced, which gives us the data to build from.) 59 crewmembers were killed during the mission, which comes out to 13.7% of the crew. So, that will be our overall conversion rate, 13.7%.
Data Segmentation:
However, we need to segment the overall mortality (conversion) rate in order to gain the specific information that we need:
* Yellow-shirt crewperson deaths: 6 (10%)
* Blue-shirt crewperson deaths: 5 (8%)
* Engineering smock crewperson deaths: 4
* Red-shirt crewperson deaths: 43 (73%)
So, the basic segmentation of factors allows us to confirm that red-shirted crewmembers died more than any other crewmembers on the original Star Trek series.
However, that’s only just simple stats reporting – ready for some analysis?
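The basic stats and the segmentation above are easy to re-derive. Here is a short Python sketch using only the numbers from the quote; the variable names are made up. One thing the arithmetic shows is that the four listed segments add up to 58 deaths, one short of the 59 stated, and the quote does not say where the missing one belongs.

```python
CREW_SIZE = 430
TOTAL_DEATHS = 59

# Deaths per uniform, as listed in the quote.
deaths_by_shirt = {
    "yellow": 6,
    "blue": 5,
    "engineering smock": 4,
    "red": 43,
}

# Overall mortality ("conversion") rate across the whole crew.
overall_rate = TOTAL_DEATHS / CREW_SIZE

# Each segment's share of all deaths.
share_of_deaths = {
    shirt: count / TOTAL_DEATHS for shirt, count in deaths_by_shirt.items()
}

# Deaths actually covered by the four listed segments.
listed_deaths = sum(deaths_by_shirt.values())

for shirt, share in share_of_deaths.items():
    print(f"{shirt}: {share:.1%}")
```

Note that these shares are fractions of all deaths, not of the crew in each shirt color. The quote gives no headcount per color, so a per-segment mortality rate cannot be worked out from these numbers alone.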




