Sunday, 3 October 2010

iTunes Statistics

I actively maintain my iTunes library: adding new songs, rating them and throwing away the garbage. After a while I wondered what state my library was in. How many songs are in it? How are the ratings distributed? Is it a normal distribution or is it skewed? How are the playcounts and skipcounts distributed? As a first attempt to get a better insight into what is in my iTunes library, I wrote the script below.

The output of the script on my system looks like this:


Number of tracks analysed: 6970


Frequency of Ratings
----- : 0
----* : 36
---** : 348
--*** : 6064
-**** : 423
***** : 99


Frequency of playcounts
0 : 122
1 : 3398
2 : 2136
3 : 591
4 : 190
5 : 144
6 : 173
7 : 83
8 : 38
9 : 25
10 : 32
11 : 10
12 : 6
13 : 5
14 : 2
15 : 2
16 : 2
17 : 2
18 : 0
19 : 2
20 : 2
21 : 2
22 : 2
23 : 0
24 : 0
25 : 0
26 : 0
27 : 0
28 : 0
29 : 0
30 : 1


Frequency of skipcounts
0 : 6886
1 : 83
2 : 1


Time taken talking with iTunes: 0s
Time taken analysing the data : 12s

New songs added to my library get a three-star rating. When I've listened to them I adjust the rating up or down appropriately. Songs rated with zero stars get thrown away. I find that I rarely have songs with just one star since they tend to be on the edge of zero and are often deemed unworthy of even the one star.

Looking at the data above, I was kind of surprised to see how little the ratings vary: 6064 of the 6970 tracks sit at three stars. Most songs have been played only once or twice, because I'm not playing music all day long. There's one song with a playcount of 30. That's due to my son, who likes to hear one particular song over and over... and over (Fireflies by Owl City).
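
To put numbers on the ratings histogram, the counts can be turned into percentages. The snippet below is a small stand-alone sketch and not part of the script; the counts are copied from the output above, and lstShare is a name made up for this illustration.

```applescript
-- Rating counts copied from the output above (zero to five stars)
set lstRateCnt to {0, 36, 348, 6064, 423, 99}

-- Total number of tracks in the histogram
set total to 0
repeat with n in lstRateCnt
    set total to total + (contents of n)
end repeat

-- Share of each rating as a percentage with one decimal place
-- (lstShare is a hypothetical name, not used in the script)
set lstShare to {}
repeat with n in lstRateCnt
    set end of lstShare to (round ((contents of n) * 1000 / total)) / 10
end repeat

return lstShare
```

With these counts, the three-star bucket comes out at roughly 87% of the library, and no other rating reaches 10%.
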
Skipcounts are rather low, because I use another script to occasionally reset the skipcounts to 0 (subtracting the number of skips from the playcount). I'm on the fence as to whether that is a good approach, though. I wonder if skipping isn't more indicative of the song deserving a lower rating. But that's something for another article.
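
A reset along those lines might look something like the sketch below. It is not the script mentioned above, just a minimal illustration of the idea. It assumes that played count and skipped count can be written through iTunes scripting and that the second playlist is the music library, as in the statistics script. The names lstSkipped and trk are made up for the example. Since it changes library data, try it on a backup first.

```applescript
-- A sketch only: this modifies play and skip counts in the library.
tell application "iTunes"
    -- Fetch the matching tracks as a list up front, so that resetting
    -- the skipped count inside the loop cannot shift the filter's results.
    try
        set lstSkipped to (every track of the second playlist whose skipped count > 0)
    on error
        -- treat "nothing matched" as an empty list
        set lstSkipped to {}
    end try
    repeat with trk in lstSkipped
        set cntSkips to skipped count of trk
        set cntPlays to played count of trk
        -- subtract the skips from the plays, but never go below zero
        if cntPlays > cntSkips then
            set played count of trk to cntPlays - cntSkips
        else
            set played count of trk to 0
        end if
        set skipped count of trk to 0
    end repeat
end tell
```
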

If you like the script and use it, I'd appreciate it if you could post the results here in the comments of this article.

-- Initialise variables/constants
set flgActTunes to false

set ptrPlays to 0
set cntPlays to 0
set cntPlaysMax to 0
set lstPlays to {}

set ptrSkips to 0
set cntSkips to 0
set cntSkipsMax to 0
set lstSkips to {}

set lstRates to {"-----", "----*", "---**", "--***", "-****", "*****"}
set lstRateCnt to {0, 0, 0, 0, 0, 0}

set timings to {}

-- ** Fetch data from iTunes
set stime to (current date) --t1 start
-- Let's see if iTunes is running or not
tell application "System Events"
    if (get name of every process) contains "iTunes" then set flgActTunes to true
end tell -- System Events

-- Interrogate iTunes to find out the information we will be needing
tell application "iTunes"
    set {lstRatings, lstPlaycounts, lstSkipcounts, lstArtists} to {rating, played count, skipped count, artist} of the second playlist's tracks
    -- If we started up iTunes for this, let's be nice and clean up after us
    if flgActTunes = false then quit
end tell -- iTunes
set end of timings to ((current date) - stime) --t1 end

-- ** Analyse data
set stime to (current date) --t2 start
tell me
    -- fill the list
    repeat with i from 1 to (count of (items in lstRatings))
        set val to item i of lstRatings
        set cntPlays to item i of lstPlaycounts
        set cntSkips to item i of lstSkipcounts
        -- create a histogram of the ratings. Half-stars get counted as full.
        if val < 1 then
            set (item 1 of lstRateCnt) to (item 1 of lstRateCnt) + 1
        else if val < 21 then
            set (item 2 of lstRateCnt) to (item 2 of lstRateCnt) + 1
        else if val < 41 then
            set (item 3 of lstRateCnt) to (item 3 of lstRateCnt) + 1
        else if val < 61 then
            set (item 4 of lstRateCnt) to (item 4 of lstRateCnt) + 1
        else if val < 81 then
            set (item 5 of lstRateCnt) to (item 5 of lstRateCnt) + 1
        else
            set (item 6 of lstRateCnt) to (item 6 of lstRateCnt) + 1
        end if
        -- find out the maximum playcount
        if cntPlays > cntPlaysMax then set cntPlaysMax to cntPlays
        -- find out the maximum skipcount
        if cntSkips > cntSkipsMax then set cntSkipsMax to cntSkips
    end repeat
end tell

tell me
    -- prepare lists for the playcount histogram and skipcount histogram
    repeat with i from 0 to cntPlaysMax
        set end of lstPlays to 0
    end repeat
    repeat with i from 0 to cntSkipsMax
        set end of lstSkips to 0
    end repeat
    repeat with i from 1 to (count of (items in lstRatings))
        -- fill the histograms
        set ptrPlays to (item i of lstPlaycounts) + 1
        set ptrSkips to (item i of lstSkipcounts) + 1
        copy (item ptrPlays of lstPlays) + 1 to item ptrPlays of lstPlays
        copy (item ptrSkips of lstSkips) + 1 to item ptrSkips of lstSkips
    end repeat
end tell
set end of timings to ((current date) - stime) --t2 end

-- output the data to a file on the desktop
tell me
    write_data("Number of tracks analysed: " & (count of (items in lstRatings)))
    write_data("")
    write_data("Frequency of Ratings")
    repeat with i from 1 to (count of lstRates)
        write_data((item i of lstRates) & " : " & (item i of lstRateCnt) as string)
    end repeat
    write_data("")
    write_data("Frequency of playcounts")
    repeat with i from 1 to (count of items in lstPlays)
        write_data((i - 1 as text) & " : " & (item i of lstPlays) as text)
    end repeat
    write_data("")
    write_data("Frequency of skipcounts")
    repeat with i from 1 to (count of items in lstSkips)
        write_data((i - 1 as text) & " : " & (item i of lstSkips) as text)
    end repeat
    write_data("")
    write_data(("Time taken talking with iTunes: " & (item 1 of timings) as string) & "s")
    write_data(("Time taken analysing the data : " & (item 2 of timings) as string) & "s")
end tell

return {timings, lstRates, lstRateCnt, lstPlays, lstSkips}

on write_data(this_text)
    set myName to name of (info for (path to me))
    set the theLog to ((path to desktop) as text) & myName & " Log.txt"
    try
        open for access file theLog with write permission
        write (this_text & return) to file theLog starting at eof
        close access file theLog
    on error
        try
            close access file theLog
        end try
    end try
end write_data
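
iTunes reports ratings on a scale from 0 to 100, with 20 points per star. That is why the if-chain in the script tests against 1, 21, 41, 61 and 81. The same bucketing can be written as a single piece of integer arithmetic. The handler below is a sketch of that alternative. The name rateIndex is made up for this example and does not appear in the script, and the sample ratings are invented test values.

```applescript
-- rateIndex is a hypothetical helper, not part of the script above.
-- It maps a rating (0 to 100) to an index (1 to 6) into lstRateCnt,
-- rounding half-stars up to the next full star, like the if-chain does.
on rateIndex(val)
    return ((val + 19) div 20) + 1
end rateIndex

-- Invented sample ratings: none, half a star, 1, 3, 3.5 and 5 stars
set lstRateCnt to {0, 0, 0, 0, 0, 0}
repeat with val in {0, 10, 20, 60, 70, 100}
    set idx to rateIndex(contents of val)
    set item idx of lstRateCnt to (item idx of lstRateCnt) + 1
end repeat

return lstRateCnt
```

The arithmetic version is shorter, but the if-chain shows the bucket boundaries more plainly, so which one reads better is a matter of taste.
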
