Sentiment Analysis of National Tourism Organizations on Social Media

Social media is probably the largest current source of human-generated text content. User opinions, feedback, comments, and criticism point to users' mood and sentiment towards different topics, especially destinations, products or services. The rapid rise in the amount of data and constantly generated content makes it necessary to automate both data acquisition and processing in order to identify important information and knowledge. Sentiment analysis provides the opportunity to detect opinion, feeling and sentiment in unstructured texts on social media. To analyze the sentiment, machine learning with Google Natural Language API Client Libraries and Google Cloud SDK (Software Development Kit) was used. The social media of NTOs (National Tourism Organizations) were chosen for analysis, since emotional messages intended to stimulate potential visitors to the destination can be expected there. It was found that all selected NTOs add mostly positive posts, and in the sample of two hundred contributions there are only seven with negative polarity of sentiment. There was a moderate correlation between customer growth and positive polarity in the contributions. The results suggest that consistently positive post descriptions can be one of the key variables for the growth of the fan base and the stimulation of potential visitors.


Introduction
The huge increase in popularity of big data on social media and social media in general makes it possible for the general public to express their opinions on a wide range of areas such as the state of the economy, enthusiasm or disappointment with a particular product, or to show pleasure in making a purchase (Shayaa et al. 2017).
Internet content has changed over the last few years, and so has the distribution of content. The shift was made possible by new technologies, new media and new communication tools. This change has led to a massive expansion of social media and a huge increase in short informal messages that are accessible to the public (Nakov et al. 2016). With the increasing number of different blogs, groups, forums, and social media as a source of online communication, it is now possible to analyze a huge amount of data that shows mood and feelings (Lyu and Kim 2016). Properly analyzing this data can significantly improve business efficiency (Hedvičáková and Král 2019).
Content on social media spreads very quickly because users can freely share whatever they want. This is one of the reasons why social media posts may attract millions of reactions within a few days (Yoo, Song, and Jeong 2018). One of the ways marketers are currently trying to exploit this content and the response of users is sentiment analysis. Sentiment analysis is the use of natural language processing and text mining to identify and extract subjective information from source materials, most often texts (Xiang et al. 2017; Alaei, Becken, and Stantic 2017). Finding these opinions can play a key role in understanding user, consumer or voter behavior.
This has recently generated great interest in analyzing sentiment and gathering opinions, mainly from text. Most current methods and approaches to sentiment analysis rely mainly on the number of positive and negative words in the text to reveal the polarity of sentiment (Dridi and Reforgiato Recupero 2019). Sentiment analysis is used to gather opinions from different types of data (text, video or audio). Text analysis is the most common form and is used in different areas of marketing, customer needs and customer service, including tourism. The mood or attitude of customers or users is mainly analyzed from texts such as contributions, ratings and comments.
There are countless opinions on social media that express users' responses to events, friends' activities, products, or services. Although these messages are short (on Twitter, for example, up to 140 characters), they can be used to identify a person's mood and feelings through classification, sentiment analysis, or machine learning (Gaspar et al. 2016).
Most of the data available on social media is unstructured (Gaspar et al. 2016). According to them, approximately 80% of the total data in the world is unstructured, which makes it difficult to analyze this data and extract valuable information from it.
Two very important techniques that can help detect emotions and opinions from data available on social media are:
• Sentiment analysis
• Opinion mining
If the mood and feelings of users are correctly detected, these findings can help solve problems in many areas such as elections, public opinion, advertising, marketing, healthcare, public satisfaction and, of course, tourism (Ahmed, Tazi, and Hossny 2015). The authors highlight that one of the problems with sentiment analysis is that it is sometimes difficult to tell from the text what emotion the user is trying to convey. They further state that sentiment analysis can also be used to define trust in a brand or service on social media.
Traditional statistical analyses and methods are not always suitable for big data analysis, precisely because the data is often unstructured. Methods associated with Big Data analysis unify specific tools to find similarities and patterns in large volumes of data. These methods include natural language processing, artificial intelligence, data mining or predictive analysis (Kirilenko et al. 2018).
Social media such as Twitter, Facebook and the Chinese Weibo are major platforms for users to share comments and experiences about a product, a service, or even an entire business. Businesses are constantly trying to expand their offline and online networks (Pochobradská and Marešová 2018). These social media are invaluable for those who focus on and understand the sentiment of the public (Wang et al. 2016). The main goal of sentiment analysis is to correctly classify user-generated content (usually text) as having either positive or negative polarity (Dhaoui, Webster, and Tan 2017). This was the main impulse for the creation of sentiment analysis and its subsequent use on social media. Sentiment analysis is an important research field today, which still faces many challenges due to the typical structure of social media and microblogs. Unlike conventional texts such as newspaper articles, these contributions are typically very short and full of noise (Dridi and Reforgiato Recupero 2019).
It is somewhat surprising that, even though sentiment analysis is increasingly being used for various purposes, its overall use as an online marketing tool has not received much attention among scientists (Rambocas and Pacheco 2018). Over the past few years, sentiment analysis has been increasingly applied to online feedback and posts. Sentiment analysis uses the principles of natural language processing to identify attitudes and opinions about a specific product, description, or value. Given the huge amount of information and data on the Internet, manual evaluation of sentiment is not a suitable option. Automating the process of collecting and evaluating data is the only practical way to extract usable opinion from data available on the Internet. The evaluated data can then be used to improve decision-making. Decision-making improved by sentiment analysis can be beneficial in many areas, including financial markets, marketing, e-commerce, politics, law, public decision-making and tourism.
Sentiment analysis has become a standard component of the social media analysis toolkit of marketers and customer relation managers in large organizations (Thelwall 2019). For its use from 2014 to 2019, a review study by Drus and Khalid (2019) demonstrates the prevailing lexicon-based approach. In tourism, the analysis of sentiment in social media is used to research the perception and evaluation of the quality of tourist services (Airport Service Quality

Methodology
The purpose of this research was to find out which sentiment NTOs use in their posts on social media and to compare this sentiment with the growth in followers over the past two years. Facebook was chosen as the focus social medium because it is the dominant social network and contains the largest number of companies trying to promote their products and services. All analyzed posts in this work are from Facebook. NTOs from the countries with the highest numbers of international tourists were selected. According to UNWTO (2019), the top ten most visited countries in the world in 2018 were France, USA, Spain, China, Italy, Great Britain, Germany, Mexico, Thailand and Turkey.
Of the ten countries selected, Thailand is the only country that does not post on Facebook in English, and for this reason, another country has been added, namely Australia.
A Google product and library package called Artificial Intelligence and Machine Learning with Google Natural Language API (Application Programming Interface) Client Libraries and Google Cloud SDK (Software development kit) was used to analyze sentiment.
The algorithm for sentiment analysis was written in Python using the PyCharm development environment. The sentiment analysis goes through the inserted text and identifies the emotional opinion in it, in order to reveal the author's attitude as positive, negative or neutral. The sentiment analysis determines which polarity prevails in the text. The output variables of this method are score and magnitude.
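As an illustration of this step, the following minimal sketch shows how a single post could be scored with the Google Cloud Natural Language client library for Python. It is not the authors' code: the function name score_post is hypothetical, and the sketch assumes that the google-cloud-language package is installed and that Google Cloud credentials are configured in the environment.

```python
def score_post(text, client=None):
    """Return (score, magnitude) for one post text.

    Hypothetical helper, not the authors' code. Assumes the
    google-cloud-language package and valid Google Cloud credentials.
    """
    # Imported lazily so this module can be loaded without the package.
    from google.cloud import language_v1

    if client is None:
        client = language_v1.LanguageServiceClient()

    # Wrap the raw post text as a plain-text document.
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )

    # Document-level sentiment carries the two output variables.
    response = client.analyze_sentiment(request={"document": document})
    sentiment = response.document_sentiment
    return sentiment.score, sentiment.magnitude
```

Passing an existing client avoids creating a new connection for every post when many posts are scored in a loop.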
The score variable indicates the overall emotional polarity of the inserted text. Magnitude shows how much emotional content is present in the text: the higher the magnitude, the higher the number of positive and negative words in the text usually is. The Natural Language API distinguishes between positive and negative emotions in the selected text, but does not identify specific emotions. For example, "annoyed", "sad" or "disappointed" are all considered negative emotions. If the score of the inserted text is around zero (neutral), it may indicate a text with few words expressing emotion, or a text with mixed emotions containing both positive and negative words. The authors used magnitude values to distinguish these cases, as truly neutral documents will have a low magnitude value, while mixed documents will have higher magnitude values. The examples below show some sample values and how to interpret them:
• Score 0.9 and magnitude 4.2 (Clearly Positive)
• Score -0.7 and magnitude 3.8 (Clearly Negative)
• Score 0.2 and magnitude 0.1 (Neutral)
• Score 0.0 and magnitude 5.4 (Mixed)
The original research plan was to analyze posts that link to a clickable article (that is, not posts consisting of a video or image), first analyzing the text of the post on social media and then the sentiment of the article reached by clicking the post. After examining hundreds of contributions from the selected NTOs, this procedure was changed to study the text on social media only, since most NTOs did not add as many as ten article posts in 2019. They focus mainly on photos and videos.
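The interpretation of score and magnitude described in this section can be sketched as a small classification function. The cut-off values (0.25 for the score and 1.0 for the magnitude) and the function name are assumptions chosen only so that the four sample values listed in this section fall into the stated categories. The paper does not report which thresholds were applied.

```python
def interpret_sentiment(score, magnitude, score_cutoff=0.25, magnitude_cutoff=1.0):
    """Map a (score, magnitude) pair to a coarse polarity label.

    The cut-offs are illustrative assumptions, not values from the paper.
    """
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # Score near zero: low magnitude means little emotional content,
    # higher magnitude means positive and negative content cancel out.
    if magnitude < magnitude_cutoff:
        return "neutral"
    return "mixed"


# The four sample values from the text.
samples = [(0.9, 4.2), (-0.7, 3.8), (0.2, 0.1), (0.0, 5.4)]
labels = [interpret_sentiment(score, magnitude) for score, magnitude in samples]
```

Because magnitude grows with the amount of emotional text, a suitable magnitude cut-off would in practice depend on the typical length of the analyzed posts.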
The text used to analyze sentiment will be twenty posts from the Facebook account of each national tourism organization (these posts are usually shared in the same form on Twitter). Only the text published with the post on the social network will be analyzed. A total of 200 posts from the 10 selected NTOs will be tested. Every character of each post will be analyzed. Only links will be excluded from the sentiment analysis, as they have no value for this test.
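The exclusion of links could be implemented as a simple preprocessing step before the text is sent for analysis. The following is a minimal sketch under stated assumptions: the regular expression, the function name strip_links, and the collapsing of leftover whitespace are illustrative choices, since the paper does not describe how links were removed.

```python
import re

# Assumed definition of a link: anything starting with http://, https:// or www.
# and running up to the next whitespace character.
_URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")


def strip_links(post_text):
    """Remove links from a post and tidy the whitespace left behind."""
    without_links = _URL_PATTERN.sub(" ", post_text)
    # Collapse runs of whitespace so a removed link leaves no double spaces.
    return " ".join(without_links.split())


cleaned = strip_links("Discover Provence this summer! https://example.com/provence #France")
```

Hashtags and emoticons are deliberately left in place here, because only links are named as excluded.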
The following research question will be asked: Does the polarity of sentiment analysis affect subscriber growth on social media?
Pearson correlation in statistical software IBM SPSS Statistics will be used for statistical analysis.
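For readers without access to SPSS, the Pearson correlation coefficient can also be computed directly from its definition. The sketch below uses only the Python standard library. The function name and the example numbers are illustrative and are not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustration: four accounts with a score and a growth figure each.
example_r = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
```

With only ten NTOs, each coefficient rests on ten pairs of values, so it should be read together with its significance level.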

Results
To determine the sentiment used by the selected national tourism organizations, a total of 200 posts were analyzed. The analysis covered the text attached to each post on social media. Posts were selected at random from all posts published in 2019 and are mostly image or video posts. These 200 posts were selected from a total of almost 2,500 published posts.
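A random selection of this kind can be made reproducible by fixing the seed of the random number generator. The sketch below assumes that the posts of each NTO are available as a list and draws twenty per organization without replacement. The function name, the seed value and the placeholder data are hypothetical, as the paper does not describe the selection procedure in more detail.

```python
import random


def sample_posts(posts_by_nto, per_nto=20, seed=2019):
    """Draw per_nto posts at random, without replacement, for each NTO."""
    rng = random.Random(seed)  # fixed seed makes the selection repeatable
    return {
        nto: rng.sample(posts, per_nto)
        for nto, posts in posts_by_nto.items()
    }


# Placeholder data: 250 dummy post identifiers for two organizations.
all_posts = {
    "France": [f"fr-{i}" for i in range(250)],
    "Italy": [f"it-{i}" for i in range(250)],
}
selected = sample_posts(all_posts)
```

If an NTO published fewer posts than requested, random.sample raises a ValueError, which makes such a case visible instead of silently shrinking the sample.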
One very positive, very negative and neutral short text will be presented to demonstrate the functionality of the algorithm.

Clearly Positive
Text: "The product is amazing! It solved my problem and I highly recommend it!"
Output: Sentiment: 0.9, magnitude 1.9

Clearly Negative
Text: "The movie was awful, the performances were terrible, and even the music was not very good."
Output: Sentiment: -0.9, magnitude 1.8

Neutral
Text: "The book is well written even though it has weaker passages, the main character is awesome too."

The table shows that all selected NTOs add positive polarity contributions on average. Of the two hundred contributions analyzed, only seven had negative polarity. Specifically, one weak negative contribution for Spain with a polarity of -0.1, two negative contributions for China with a polarity of -0.2, one negative contribution for Mexico with a polarity of -0.4, one negative contribution for Turkey with a polarity of -0.1, and two negative contributions for Australia with polarity -0.3 and -0.1.
The average sentiment analysis score is 0.49 and the average magnitude is 1.03 across all 200 analyzed posts and all NTOs.
France had the highest fan growth over the last two years, with an increase of 30% and a sentiment value of 0.58, which is above average. Turkey is in second place in fan growth, with an increase of 24% over the last two years and an average sentiment value of 0.75, which is also above average.
Other findings from examining the posts include that each NTO has its own style of posting and more or less adheres to it. For instance, France never fails to include a positive word in its contributions, and because of this, its sentiment polarity value is one of the highest. Contributions in the case of France are usually short, about two lines. In contrast, the US has longer contributions, often around four lines. Spain's posts are in two languages and contain a higher number of hashtags. China and Australia often make use of Instagram posts from other users. Italy is very active on social media and adds several posts per day. Turkey, on the other hand, is less active and adds only a few posts a month. Great Britain almost always puts emoticons in its posts.
Furthermore, Pearson correlation coefficients were computed for the following variables: the increase in the number of fans over the last two years, the sentiment analysis score, the magnitude, and the number of characters in the text, including spaces.
Table 2. Correlation coefficients for increase in fans over the last two years, sentiment analysis score, magnitude, and the number of characters in the text.

Correlations
The table shows a correlation of 0.451 between the increase in the fan base and the sentiment polarity value. A correlation of this size is referred to as weak to moderate. Sentiment polarity of posts can therefore be one of the key variables for the growth in the number of social media fans, as this is the first thing that should interest users. It can also be seen from the correlations that magnitude and the number of characters in such short texts do not affect the growth of a fan base.

Discussion
Social media are, as confirmed by the findings of a number of authors (e.g. Wang et al. 2016; Dhaoui, Webster, and Tan 2017), intensively used not only in NTOs' marketing but in the majority of industries. The dominant social medium in the marketing of the NTOs of the ten most visited countries is Facebook (Hruška and Pásková 2018).
This paper focused on the analysis of sentiment in social media posts and on linking the polarity of sentiment to the growth of the fan base. Based on the correlation, the polarity of sentiment was found to have a moderate influence on the growth in the number of subscribers. Of course, a number of other factors, such as the quality of posts and the quality of shared images and videos, could influence subscriber growth. The political situation, nationality, the size of the country, marketing and investment in tourism are also important (Ahmed, Tazi, and Hossny 2015; Drus and Khalid 2019). Nonetheless, this research highlights the importance of the polarity of sentiment in contributions. The main social network where this research was conducted is Facebook, but NTOs often share posts from Facebook in the same wording on Twitter and sometimes Instagram (this is consistent with the results published by Hruška and Pásková 2018). It is therefore possible that the same conclusions will apply to other social media.
All countries add mostly positive contributions and out of 200 tested contributions only seven had negative sentiment. Another finding was a moderate correlation between the sentiment value in the post and the increase in the number of subscribers over the last two years. These findings suggest that, although the number of fans is certainly influenced by many variables, the sentiment of posts could play a very important role in the growth of social network accounts.
The method of analysis used in this work is not suitable for detecting sentiment polarity in videos, but this is likely to be a future development. Social media users often prefer not to read longer texts, so further research could focus on analyzing the sentiment of videos, on YouTube, for example.
The limitations of this work are that it targets only one factor (the text added to a social media post and its sentiment) and does not take into account other factors such as video quality or shared image quality. Moreover, the analysis was carried out on a relatively small sample of two hundred contributions. Further research could consist of analyzing a larger number of posts, including the text of the linked article reached by clicking the post, and thus analyzing sentiment in longer texts, also in areas other than tourism. It would be advisable to use a grounded theory approach (Lai 2015) as the methodology for further research, with a consistent definition of research objectives, type of social media, sample size and interpretation of results.

Conclusions
This work analyzed the sentiment of social media posts on Facebook for the NTOs of the countries with the most international tourists. The main result is the finding that there is a moderate correlation between sentiment and growth in the number of fans of NTOs' social media.
Among the NTOs with the largest increase in Facebook subscribers over the last two years, France has the highest subscriber growth (30%), a sentiment score of 0.58, and an average post description length of 105 characters. In comparison, the average sentiment of all selected countries and posts is 0.49. Turkey came second with 24% subscriber growth, an even higher sentiment value of 0.75, and an average post length of 227 characters. China ranked third in subscriber growth (18%), with a lower sentiment value of 0.38 and an average post length of 170 characters. The research also found that, among all the selected countries, France added the most posts containing an article. The other selected NTOs usually add only a combination of video and pictures. Another finding was that the four countries with the lowest subscriber increases have below-average sentiment values. Observation of the accounts of the selected NTOs also revealed that they share user-generated content (typically an Instagram photo or a YouTube fan video) on their social media, especially on Facebook. The selected NTOs usually add their own text to such posts, so this factor should not have a big impact on the analysis in this paper.