Big Data, memes, information diffusion in online social networks and opinion dynamics

dc.contributor Ramasco Sukia, José Javier
dc.contributor.author Luque Sánchez, Álvaro
dc.date 2023
dc.date.accessioned 2024-10-04T08:26:04Z
dc.date.available 2024-10-04T08:26:04Z
dc.date.issued 2023-10-06
dc.identifier.uri http://hdl.handle.net/11201/166266
dc.description.abstract [eng] Online Social Networks are a key source of information about human interactions, owing to their widespread use in contemporary society. In this work, a weighted, directed network was built from Twitter reply data for three countries with different population sizes over an eight-year observation window. Once the network is built, community detection methods are applied to find densely connected clusters of users. These communities are then studied from a thematic point of view, treating hashtags as memes through which members of a community share ideas and common interests. A statistical study is conducted on the variety and repetition rate of hashtags inside communities, and the similarity between pairs of groups is quantified. The objective is to test whether online communication through hashtags on Twitter follows two trends. First, whether the number of unique hashtags as a function of community size follows Heaps' law, a well-known law for written texts in which the number of unique words grows sublinearly with text length. Second, whether hashtag use in such groups changes around a boundary of SD = 150 members, which is believed to be the limit of stable social relationships a human being can maintain, following the ideas of Robin Dunbar: below this threshold, communities should behave more like close acquaintances in real life, exhibiting a wide range of topics that are repeated less, whereas big groups should represent communities that aggregate users who follow a certain topic, thus exhibiting a higher repetition rate.
From the second idea it also follows that there should be more nonzero similarity values for pairs of small groups, since covering a larger number of hashtags with less repetition should lead to some overlap in the topics covered, unlike pairs of big communities, which should show many zero similarity values because their hashtag distributions are peaked around certain topics. First, the vast amount of data gathered made the desired metrics computationally expensive to obtain and forced us to sample a small number of communities for each country. Moreover, data from the most populated country had to be left out because of its dimensionality. As for the hypotheses, the growth of unique hashtags as a function of group size was confirmed, although the curve does not resemble Heaps' law, since the growth is significantly low. The separation that follows from Dunbar's results reveals that the repetition rate is indeed higher for more populated groups of users than for small groups. Finally, the similarity measure implemented for this work does not yield very illuminating results for testing our hypothesis, mainly as a consequence of the community sampling that was conducted and the limited processing applied to hashtags. In future work, the implementation must be improved to tackle problems such as lemmatization, intruder hashtags that do not belong naturally to their respective networks, and computational efficiency, by using Big Data tools that relieve the cost of handling several million data entries. Moreover, we propose a network of communities whose link weights are proportional to their similarity, to which a propagation model for hashtags can be applied to emulate the transmission of specific topics in such a network. ca
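The first hypothesis in the abstract can be illustrated with a minimal sketch. It assumes Heaps' law in its usual form, V(N) = K * N^beta with 0 < beta < 1, where N is the community size and V the number of unique hashtags, and estimates beta by ordinary least squares in log-log space. The function name `fit_heaps_law` and the sample numbers are hypothetical and are not taken from the thesis, whose actual fitting procedure is not described in this record.

```python
import math


def fit_heaps_law(sizes, unique_counts):
    """Fit V = K * N**beta by least squares on log(V) versus log(N).

    sizes         -- community sizes N (all > 0)
    unique_counts -- unique hashtags V observed in each community (all > 0)
    Returns the pair (K, beta).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in unique_counts]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope of the regression line in log-log space is the exponent beta.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    # The intercept is log(K).
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


if __name__ == "__main__":
    # Synthetic data for illustration only, not results from the thesis.
    sizes = [10, 50, 150, 500, 2000]
    unique_counts = [30, 80, 140, 260, 520]
    k, beta = fit_heaps_law(sizes, unique_counts)
    print(f"K = {k:.2f}, beta = {beta:.2f}")
    print("sublinear growth" if 0 < beta < 1 else "not sublinear")
```

A fitted beta between 0 and 1 is consistent with sublinear growth. A curve that grows but does not resemble Heaps' law, as the abstract reports, would show up as a poor fit to this form or as a very small exponent.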
dc.format application/pdf
dc.language.iso eng ca
dc.publisher Universitat de les Illes Balears
dc.rights all rights reserved
dc.rights info:eu-repo/semantics/openAccess
dc.subject 542 - Practical laboratory chemistry. Preparative and experimental chemistry ca
dc.subject.other Complex Networks ca
dc.subject.other Big Data ca
dc.subject.other Online Social Networks ca
dc.subject.other Community structure ca
dc.subject.other Hashtags ca
dc.title Big Data, memes, information diffusion in online social networks and opinion dynamics ca
dc.type info:eu-repo/semantics/masterThesis ca
dc.type info:eu-repo/semantics/publishedVersion
dc.date.updated 2024-05-03T09:14:20Z

