Abstract: With the development and popularization of the Internet, more and more users watch movies and TV dramas on online platforms. To meet users' personalized needs, recommendation systems have become a core function of movie websites. Using big data technology and machine learning algorithms, a recommendation system analyzes user behavior data, interests, and viewing habits in order to recommend movies and TV shows that match each user's tastes. As big data technology develops, movie websites can collect more user data, such as ratings, comments, viewing history, and favorites, which gives recommendation systems richer information for predicting user preferences and needs. Progress in machine learning has likewise broadened the available techniques, including collaborative filtering, matrix factorization, and deep learning. The online video market is also fiercely competitive: to attract and retain users, movie websites must continuously improve both recommendation accuracy and user experience. Researching more efficient and accurate recommendation algorithms has therefore become an important task for movie websites.
This paper designs a movie website recommendation system based on big data. The core function of the system is to crawl massive raw movie rating data, store and process that data with big data technology, and display the analysis results as a visual list.
The main research work and achievements of this paper are as follows:
1. A big data based backend software system for a movie website was designed and successfully developed using Spring, SpringMVC, MyBatis, ECharts, and related software development technologies. The movie rating data used by this backend system is crawled from movie websites.
2. A Python-based web crawler was used to crawl movie ratings from movie websites. The crawled raw data is cleaned and stored on Hadoop, processed with the MapReduce distributed computing programming model, and the results are finally saved to MySQL for storage and analysis.
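The MapReduce step in item 2 can be illustrated with a minimal sketch. This is not the system's actual job, which runs on Hadoop. It is a pure-Python imitation of the map, shuffle, and reduce phases, assuming a hypothetical cleaned record format of `movie_id,rating` and assuming that the computed statistic is the average rating per movie. The function names and sample records are illustrative only.

```python
from collections import defaultdict


def map_rating(line):
    """Map phase: turn one cleaned 'movie_id,rating' record into a (key, value) pair.

    The record layout is an assumption made for this sketch.
    """
    movie_id, rating = line.strip().split(",")
    return movie_id, float(rating)


def shuffle(pairs):
    """Shuffle phase: group all rating values by movie id."""
    groups = defaultdict(list)
    for movie_id, rating in pairs:
        groups[movie_id].append(rating)
    return groups


def reduce_average(movie_id, ratings):
    """Reduce phase: collapse one movie's ratings into its average."""
    return movie_id, round(sum(ratings) / len(ratings), 2)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over in-memory records."""
    pairs = [map_rating(line) for line in lines if line.strip()]
    groups = shuffle(pairs)
    return dict(reduce_average(movie_id, ratings) for movie_id, ratings in groups.items())


# Hypothetical cleaned records standing in for crawled rating data.
sample = ["m1,8.0", "m2,6.5", "m1,9.0", "m2,7.5", "m3,5.0"]
averages = run_job(sample)
print(averages)
```

In a real deployment the map and reduce functions would run as distributed tasks over files stored on HDFS, and the resulting per-movie aggregates would then be written to MySQL.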
Keywords: big data; Hadoop; Python; Movie ratings