Subsonic

The question of “How do I access my music collection over the Internet?” has a few good solutions. I previously talked about Ampache, which is quite good but gave me some trouble due to the extreme size of my media library. Ampache had to be tweaked to allow an unlimited execution time and higher memory limits in php.ini. That is easy enough, but it still never finished cataloging all of my music. It completed most of it and I was mostly happy with it, but I am not a “mostly” kind of guy. The hunt was on for something better.

Subsonic is a standalone media streaming application focused on music and built on Java. It has almost total feature overlap with Ampache in that they both have cover art display, tag editing, multiple user support, transcoding, and more. After the library import, they both use around 550MB of active memory with my library. Unlike Ampache, Subsonic has a smartphone app that you can use to stream to your mobile device.

The install for Subsonic is much easier and has far fewer steps, as the HTTP server and all other necessary software are included in the RPM or DEB. I made an Ubuntu Server VM and, after updating, I just had to navigate to the download page and choose the correct option for the OS, in this case a DEB. SSH into the VM and wget the download link to get the package onto the VM. Then it is just a matter of dpkg -i subsonic.deb. The install is complete at this point. It’s worth noting, though, that unless you wrangle with it, it wants to run as root by default. So it’s best to carve out a whole VM or physical machine dedicated to just Subsonic. One more thing to do before you are done configuring the VM is to attach your storage. You might want to copy your music onto the VM’s local disk, assuming you made the disk big enough. Since I already have a storage server, I just exported an NFS share and mounted it to /mnt in the VM.

After the install your server will be running on that host at port 4040. After pulling up that location in the browser you get a glimpse of the web interface:

subsonic0
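If the page does not come up, a quick way to tell "Subsonic isn't listening" apart from "my browser is being weird" is to try a plain TCP connection to port 4040. Here is a minimal sketch of that check; the function name is my own, and nothing in it is specific to Subsonic.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

Calling is_port_open("subsonic-vm.example", 4040) with your VM's real hostname or IP in place of that made-up one returns True once the service is accepting connections.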

After you have changed the password and edited the media folders to access your media you will want to initiate a scan under “Settings” —> “Media folders.” When this is complete your media has been added to the database…but where the heck is it? It took me 20 minutes on the forums to figure this one out. Apparently a newly added option is restricting certain media folders to certain users. So go here:

subsonic1

Check the box corresponding to your media folder name and you are off to the races.

I’ve found the search function to work extremely well, and playlists are easy to make. There is even Last.fm scrobbling if you are into the whole “social music” thing.

Hashing it out

The cryptographic hash function is a really important concept to understand in the world of computing. It plays a fundamental role in many different ways. So what the heck is it? The ELI5 version is that you have a chunk of data. It goes into a complex mathematical algorithm and out the other side comes a fixed-length string. An example of the output is 326327CC2FF9F05379F5058C41BE6BC5E004BAA7. If the hash function is working correctly, it should be mathematically unfeasible to find any other chunk of data that returns that same string out the other side. When I say “mathematically unfeasible” what I really mean is that the computing power necessary to create the same string using a different input would require, for instance, all of the material in the known galaxy. While this is true for most of the modern hash functions, some older ones aren’t so secure. MD5 was the old standby for a long time. It’s still non-trivial to exploit, but it has been proven that with enough computing power and the right software an attacker can craft two different inputs that produce the same output string. This is called a collision. Collisions are bad because the software that is implementing the hash function almost always needs to be able to trust that a given output corresponds to exactly one known input. Being able to manipulate this allows a nefarious party a great deal of power.
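To make the "chunk of data in, string out" idea concrete, here is a small sketch using Python's standard hashlib module. I am assuming SHA-1 because the example string above is 40 hex characters long, which is the size of a SHA-1 digest; the input strings are made up for illustration.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as 40 uppercase hex characters."""
    return hashlib.sha1(data).hexdigest().upper()


first = sha1_hex(b"my music collection")
second = sha1_hex(b"my music collection!")  # one extra character

print(first)
print(second)                      # looks nothing like the first string
print(len(first), len(second))     # same length regardless of input size
print(first == sha1_hex(b"my music collection"))  # same input, same output
```

The two printed strings share no obvious pattern even though the inputs differ by a single character, and that is the property everything else in this post leans on.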

An example of this power is the very sophisticated piece of malware called Flame. Flame infected machines by exploiting an MD5 collision to forge a certificate, so that its packages appeared to be signed by Microsoft and to come from Windows Update. The upshot is that the computers would search out updates and an infected machine would intervene and say “Hey bro, I have a totally legitimate package for you to install. It’s even signed. Totes legit.” The naive and vulnerable computer would then download and execute that package and one more machine would be totally owned by the attacker. Microsoft no longer uses MD5 due to its susceptibility to collisions.

The amount of computing power necessary to create a collision from MD5 is easily within the reach of even a modestly funded attacker. A small network of machines using GPU calculation can create a custom collision in a matter of days. Since Amazon now sells GPU instances on AWS, a cluster can be rented, brought up, used, and torn down again in record time and for an almost shockingly small sum.

Passwords, for instance, are usually kept in hashed form in a database. Since the frontend that users type their password into should not accept the hash of a password, the end result is that everything is much more hardened against attack. Even if someone from the outside manages to exploit the machine and gain read privileges on the database, all they have is the hash output strings. In order to actually use these to log in they would have to brute-force the algorithm, hashing guess after guess until one produces the stored string and reveals the raw text of the password. Since hash functions are designed to only go one way, anyone with a moderately decent password would be safe for a substantial period of time. Certainly long enough for the operator of the exploited service to alert users to the breach and have them change their passwords.
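Here is a rough sketch of what storing and checking a hashed password can look like. Note that it goes a step beyond the plain hashing described above: it uses salted PBKDF2 from Python's standard library, which is a common hardened variant, and the iteration count and function names are my own assumptions rather than anything a particular service uses.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the raw password is never stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key from a login attempt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_key)
```

The database only ever holds the salt and the derived key. A login attempt is checked by running the same function again, so someone who steals the table still has to guess passwords one at a time.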

But all of this is just an abstraction for those that use technology instead of creating it. Even so it’s useful to know in order to be able to verify your software, for instance. How do you know that the software you just grabbed is indeed legitimate? Many pieces of software list a hash string on the downloads page so that users can verify the authenticity of the package for themselves. This protects against, say, a DNS hijack that tricks you into downloading malware. It also guards against the more common problem of data that didn’t download properly. I like a piece of software called HashTab to check out the hash strings in Windows. For Linux and Unix systems you can use md5sum or sha1sum to get an output from the terminal. Let’s check out our file below:

hashtab

By right-clicking on our file and choosing Properties we can then get to the File Hashes tab and check out our hash output. The file in question is an ISO from Microsoft. We know that this is a genuine piece of Microsoft software because the SHA sum was published for this Windows 7 RTM disk and our output matches it. We now know that nobody inserted anything malicious into our download and that we have the full and complete file.
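The same check can be scripted if you would rather not click around. This is a minimal sketch assuming the publisher lists a SHA-1 sum; the function names are mine, and the file is read in chunks so a multi-gigabyte ISO never has to fit in memory.

```python
import hashlib


def file_sha1(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file piece by piece and return the lowercase hex digest."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare the computed hash to the published one, ignoring case and whitespace."""
    return file_sha1(path) == published_hex.strip().lower()
```

You would call matches_published("some_download.iso", "...") with the string copied from the download page. A False result means either a corrupted download or a file that is not what it claims to be.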

Hash isn’t just for the Cheech and Chong set. It can even be used by people with jobs!

nosce te ipsum

In computing, as in life, taking the time to fully understand what you are trying to accomplish is never a waste of time. Sun Tzu clearly knew a great deal about computer hardware and software when he gave us “故曰:知彼知己,百戰不殆;不知彼而知己,一勝一負;不知彼,不知己,每戰必殆.” (Roughly: know the enemy and know yourself, and in a hundred battles you will never be in peril; know yourself but not the enemy, and you will win one and lose one; know neither, and every battle is peril.) If you know what you are trying to accomplish and you know what the hardware is capable of then you cannot lose. If you know one and not the other you might as well be flipping a coin. An absence of knowledge of both will certainly lead to a poorly running computer and/or inefficiency in cost.

A case in point is a computer used in a family or office setting that will never play games. For this application, modern CPUs are way beyond the point of being “fast enough” even on the extremely low end. Even going as low as $47 won’t make a difference. Web browsers and office documents are so rarely going to be CPU bound that spending more is going to get you very little in terms of actual performance payoff. Even more inefficient would be spending to the point where you are adding cores. Slotting a quad core into a use-case like this will result in nothing more than double the number of idle cores sitting around twiddling their expensive thumbs all day. Web browsers like Chrome can thread out very well, but we have to keep in mind that your typical office worker isn’t going to be using the browser in such a way that it can cap out a cheapo dual core. I doubt Joe in Accounting is going to be pulling up 100 tabs of Flash-based streaming porn. And if he does, that’s more of a management problem than a computing one.

Your time spent understanding the use-case extends perhaps most importantly to storage. Hard drives are still going to be the least expensive option. Analyze how much space your users are using currently. Most will likely never have any cause to store more than 128GB locally. If you are a large enough business, allowing anyone to store things locally is a bad idea anyway, as that creates a single point of failure for data that could very well be of critical importance to your operations. This problem is best handled with a GPO and an Active Directory member server running RAID. Also with storage it’s worth considering if the slight cost increase of going to an SSD is going to pay off. I think in most cases it will for this type of work if we consider how expensive labor is. The combined cost of 20 workers waiting for their computers to boot up or lagging behind as the storage catches up with them is going to cover that initial cost outlay very quickly. Consider the cost delta between a decent spinning disk and a good quality SSD. Even if you somehow pay minimum wage it will still pay off quickly. Other things are less quantifiable, such as the reduced stress of having your workers using fast storage. If they are anything like me, they get frustrated when they have to wait for things to load.
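If you want to put a number on that payoff, the arithmetic is simple enough to sketch. Every input below is a hypothetical placeholder, not a measurement: plug in your own price difference, wage, and an honest estimate of the waiting time saved per worker.

```python
def breakeven_days(cost_delta: float, hourly_wage: float, minutes_saved_per_day: float) -> float:
    """Working days until wages recovered from less waiting equal the extra SSD cost."""
    saved_per_day = hourly_wage * (minutes_saved_per_day / 60.0)
    return cost_delta / saved_per_day


# Hypothetical inputs for one workstation: replace all three with your own.
days = breakeven_days(cost_delta=60.0, hourly_wage=12.0, minutes_saved_per_day=5.0)
print(days)
```

The result is per machine, so the number of workers does not change the break-even point, only the total money on the table.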

Another thing to consider is size. Since our office and web browsing machines are so low power they don’t need to be very large at all. Machines running our very cheap Intel CPUs and solid state storage should only be able to peg maybe 50 watts of power at full tilt. They will spend 95% of their lives using half that. You can easily get by with using a form factor such as mini-ITX. This will use up much less desk space and, rather informally, look bomb. One I built recently for such a purpose is roughly the size of your typical George R.R. Martin hardcover.

flat_cropped

upright_cropped

This cute little guy is purpose built, dirt cheap, and boots and opens programs with enough speed to really blow your hair back.

Ampache

It seems like most of the world is content with accessing their media through streaming. For any number of reasons, I prefer to keep actual copies of books, TV, movies, and music. Other plebeians seem to be happy renting the media they want for a monthly fee. After acquiring some decent quality IEMs, I thought it was time to make sure I could access my audio library on the go. While I could pull the files through SSH and access them using, say, SSHFS, it took little time for me to get frustrated with this inelegant solution. I had previously used GNUMP3d on a LAMP machine to pull music over the Internet. That was a great piece of software, and it worked very well. I decided to go a different way this time as the project hasn’t been updated since 2007 and there are better alternatives in these halcyon modern times. This led me to Ampache, which is modern, maintained, and open source.

Ampache is a web application that runs on anything that has the necessary basic components. In this case what we need are an HTTP server, PHP for scripting, and a MySQL database server. Since I am a glutton for punishment, I decided to pick the best of each of these components and see if I could hack it all together and make it work. Picking the “best” HTTP server is a largely masturbatory exercise. This is much like arguing the merits of other holy software such as text editors. Going by just the facts you’d have to agree that the fastest web server software is going to be NGINX. If NGINX is fast enough to approach the outer limits of 10,000 concurrent connections, it should be good enough for this use-case. As for PHP, there is little choice there. Happily, there is a bit of choice with our database server. Ever since Larry Ellison and his band of money-grubbing devils sunk Sun Microsystems, MySQL has been tainted. Luckily, MariaDB came along to return our precious freedom.

freedom

As far as Ampache goes, I have found it to be slick, fast, and intuitive. There are plenty of features buried in there as well. About half of my music is stored as FLAC, which is high bitrate and quite large. I told Ampache to transcode this on the fly, and it promptly told me to install ffmpeg. So after that little altercation, I can access my FLAC as more WAN-appropriate V4 MP3. The interface is pretty slick. The media player pops up at the bottom as soon as you start playing anything. Playlists can be made and saved on the fly.

ampache1
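For the curious, the FLAC to V4 MP3 conversion mentioned above can be reproduced by hand with ffmpeg. The sketch below is not Ampache's own transcoding configuration, just a minimal illustration of an equivalent ffmpeg call. It assumes an ffmpeg build with the libmp3lame encoder, and the function names are mine.

```python
import subprocess
from typing import List


def flac_to_mp3_command(source: str, target: str, vbr_quality: int = 4) -> List[str]:
    """Build an ffmpeg command that encodes source to VBR MP3 with LAME."""
    return [
        "ffmpeg",
        "-i", source,               # input file
        "-codec:a", "libmp3lame",   # MP3 encoder
        "-q:a", str(vbr_quality),   # VBR quality level; 4 corresponds to LAME's V4
        target,
    ]


def transcode(source: str, target: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(flac_to_mp3_command(source, target), check=True)
```

Lower quality numbers mean larger, better-sounding files, so V4 sits in the middle: small enough to push over a home upload link without sounding obviously worse on the go.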

If you are looking for a solution for sharing your music over the web, I would check it out.