As competition between technology corporations grows fiercer, companies are working to win the support of the research community. The large machine learning dataset released by Yahoo illustrates the point. The company announced the release this morning in a blog post.
So what does the 13.5 TB dataset contain? The data records how users interacted with Yahoo sites over the course of four months, from February 2015 to May 2015. It covers about 20 million users who accessed Yahoo websites, including the homepage, Yahoo News, Yahoo Finance, Yahoo Real Estate, Yahoo Sports and more. Researchers will be able to conduct studies using data that includes device information, key phrases, summaries, and the age and gender of users. This should let them look for patterns in the data using machine learning methods.
Suju Rajan, director of personalization science at Yahoo Labs, stated that such data is crucial for machine learning research. Until now, this type of information has been available only to data scientists and machine learning researchers working at big companies. That changes with the dataset released by Yahoo, which will be available to academic researchers as well. Freelancers will also have access to it. The release should help drive new discoveries, as scientists can finally test theories they have held for a long time.
For the moment, we know that some of those who will use this data are from the University of California, San Diego, the Data Science Center of UMass Amherst and Carnegie Mellon University. Their work in areas such as information retrieval, machine learning and artificial intelligence stands to benefit. For its part, Yahoo wants to better understand processes like computational advertising and search ranking, in addition to the areas listed above.
Yahoo is not the first corporation to reach out to the research field. Google, Amazon Machine Learning, IBM Watson and Azure Machine Learning are all working hard to improve their services as well as increase their profits. Yahoo's announcement also stated the purpose of the release: inspiring scientists and researchers to validate their theories by testing them in a practical environment.
The dataset can be accessed at the Yahoo Webscope website: https://webscope.sandbox.yahoo.com. Technological progress never seems to take a break, and news of it arrives every day. With this new resource, researchers will surely stay busy for some time to come.