Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

University of Wollongong

Data

2016

Faculty of Law, Humanities and the Arts - Papers (Archive)

Mining Chinese Social Media UGC: A Big Data Framework For Analyzing Douban Movie Reviews, Jie Yang, Brian Yecies, Jan 2016

Analysis of online user-generated content is receiving attention from both academic researchers and industry stakeholders because of its wide range of applications. In this pilot study, we address the common Big Data problems of time constraints and memory costs that arise when using standard single-machine hardware and software. We propose a novel Big Data processing framework to investigate a niche subset of user-generated popular culture content on Douban, a well-known Chinese-language online social network. Large data samples are harvested via an asynchronous scraping crawler. We also discuss how to manipulate heterogeneous features from raw samples to facilitate analysis of various film details, review comments, and …
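
The abstract mentions an asynchronous scraping crawler but does not describe its design. As a rough illustration only, and not the authors' implementation, the general idea of bounded-concurrency asynchronous fetching might be sketched as follows. The names `fetch_page`, `crawl`, and `run_demo`, the `example.org` URLs, and the concurrency limit are all assumptions made for this sketch; `fetch_page` only simulates latency and performs no network access.

```python
import asyncio


async def fetch_page(url: str) -> str:
    """Hypothetical stand-in for an HTTP GET; no real request is made."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>review page for {url}</html>"


async def crawl(urls, max_concurrency=5):
    """Fetch all URLs concurrently, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        # The semaphore caps the number of in-flight fetches.
        async with semaphore:
            return url, await fetch_page(url)

    # gather() schedules every fetch at once and preserves input order.
    pairs = await asyncio.gather(*(bounded_fetch(u) for u in urls))
    return dict(pairs)


def run_demo():
    # Placeholder URLs (assumed for illustration, not real endpoints).
    urls = [f"https://example.org/review/{i}" for i in range(20)]
    return asyncio.run(crawl(urls))
```

Calling `run_demo()` returns a dictionary mapping each placeholder URL to its simulated page body. Because the fetches overlap while waiting, total wall-clock time is much lower than fetching the pages one after another, which is the usual motivation for an asynchronous design on single-machine hardware.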