
New Windows Azure Service, The Archivist, Helps You Export, Archive and Analyze Tweets

Posted on 20 July, 2010

Ever wonder what happens to tweets after they disappear? Would you love a way to keep track of, analyze and even export tweets related to topics you care about? Enter The Archivist, a new lab/website from Mix Online built on Windows Azure that lets you monitor Twitter, archive tweets, mine the data and export archives.


As Microsoft developer and Archivist creator Karsten Januszewski explains it, "When you start a search using The Archivist, it will create and monitor an archive based on that search that you can later analyze for insights, trends and other useful information, as well as export for further analysis or reporting."


Get an introduction to The Archivist in Karsten's blog post, read about its evolution in a post by Microsoft Designer, Tim Aidlin, or sit back and let Karsten and Tim explain The Archivist to you in the video, "The Archivist: Your friendly neighborhood tweet archiver" on Channel 9.


Why Windows Azure? Januszewski elaborates, "Windows Azure was a perfect fit for the Archivist for three reasons: first, blob storage is ideally suited to store the tweets, we've already archived more than 60 million tweets; second, the ability to use Windows Azure background worker processes to poll Twitter provides crucial functionality; and, lastly, because Twitter is a global phenomenon, Windows Azure enables The Archivist to effectively scale both the download of archives through the CDN as well as the number of web servers required, based on changing traffic patterns."
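
To make the "background worker polls Twitter, blob storage holds the tweets" pattern concrete, here is a minimal sketch in Python. It is not The Archivist's actual code and it does not call any real Twitter or Windows Azure API: `InMemoryBlobStore`, `poll_once`, `fake_search` and the `since_id` high-water mark are all hypothetical names chosen for illustration, and the "blob store" is just a dictionary in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tweet:
    id: int
    text: str


@dataclass
class InMemoryBlobStore:
    """Hypothetical stand-in for blob storage: one list of tweets per archive name."""
    blobs: Dict[str, List[Tweet]] = field(default_factory=dict)

    def append(self, name: str, tweets: List[Tweet]) -> None:
        self.blobs.setdefault(name, []).extend(tweets)


# Hypothetical search signature: (query, since_id) -> tweets with id > since_id.
SearchFn = Callable[[str, int], List[Tweet]]


def poll_once(query: str, since_id: int, search: SearchFn,
              store: InMemoryBlobStore) -> int:
    """Fetch tweets newer than since_id, archive them, return the new high-water mark."""
    new_tweets = sorted(search(query, since_id), key=lambda t: t.id)
    if not new_tweets:
        return since_id  # nothing new; keep the old mark
    store.append(query, new_tweets)
    return new_tweets[-1].id


def fake_search(query: str, since_id: int) -> List[Tweet]:
    """Canned data used in place of a real Twitter search call (assumption)."""
    timeline = [
        Tweet(1, "hello azure"),
        Tweet(2, "azure blobs"),
        Tweet(3, "azure workers"),
    ]
    return [t for t in timeline if t.id > since_id and query in t.text]
```

A worker process would call `poll_once` in a loop with a delay between calls, carrying the returned mark forward so that each poll archives only tweets it has not seen before.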


Another great part of The Archivist is that all the source code is available for download. Not only is the source code licensed so anyone can run (and enhance) their own instance of The Archivist in Windows Azure, it also provides a reference architecture for how to take advantage of Windows Azure features such as blob storage and background worker processes.