The analysis of microblogging data related to stock markets can reveal relevant new signals of investor sentiment and attention. It may also provide sentiment and attention indicators in a more rapid and cost-effective manner than other sources. In this study, we created several indicators using Twitter data and investigated their value when modeling relevant stock market variables, namely returns, trading volume and volatility. We collected recent data from nine major technological companies. Several sentiment analysis methods were explored, by comparing 5 popular lexical resources and two novel lexicons (emoticon based and the merge of all 6 lexicons) and sentiment indicators produced using two strategies (based on daily words and individual tweet classifications). Also, we measured posting volume associated with tweets related to the analyzed companies. While a short time period is considered (32 days), we found scarce evidence that sentiment indicators can explain these stock returns. However, interesting results were obtained when measuring the value of using posting volume for fitting trading volume and, in particular, volatility.; This work is funded by FEDER, through the program COMPETE and the Portuguese Foundation for Science and Technology (FCT)...
Master's dissertation in Information Systems Engineering and Management; The analysis of microblogging data may disclose relevant signals of
investor sentiment and attention that can be useful to model and predict
stock market variables (Bollen, Mao, & Zeng, 2011; Mao, Counts, &
Bollen, 2011; Oh & Sheng, 2011; Sprenger & Welpe, 2010). Moreover,
microblogging data can provide sentiment and attention indicators in a
more rapid and cost-effective manner than traditional sources (e.g., large
In this project, we assessed the information content of microblogging data
for explaining stock market variables. We created several indicators using
Twitter data from nine major technological companies and analyzed their
value when modeling returns, trading volume and volatility. Sentiment
indicators were produced by exploring 5 popular lexical resources and two
novel lexicons (emoticon based and the merge of all 6 lexicons) while
attention indicators were based on the posting volume.
Despite the short period analyzed (32 days), interesting results were
obtained when measuring the value of using posting volume for fitting
trading volume and volatility. However, we found scarce evidence that
sentiment indicators can explain stock returns.; The analysis of microblogging data can reveal relevant signals of
investor sentiment and attention that may be useful for modeling and
predicting stock market variables (Bollen et al....
In this article we argue for a shift toward a communication paradigm oriented to socialization, based on social software platforms and content created by ordinary users. We aim to reflect on the landscape of social networks on the Internet and their supporting platforms, considering the emergence of a new sociability. Assuming that the new society rests on the exclusion of territorial determinism and can bring about (in so-called "info-included" societies) a social and cultural division of individuals, our proposal seeks to contribute to a theoretical framework with reflections on the emergence of a new deterritorialized sociability, grounded in a communication model that is in permanent flux, has transformed the concept of the user into "Consumer 2.0", and has made it possible for the receiver to now be a producer for a global audience: are we at the start of the era of large-scale "prosumers"? Our working hypothesis is that networked environments (based on collective intelligence and social action) foster a new type of citizenship and, consequently, new social relations and practices. The proposal is to use social network analysis to study conversation maps (based on content and on interactions between users) on the microblogging service Twitter through hashtags (keywords) related to world-scale events, and to draw social networks based on "folksonomy" ("social tagging")...
Sentiment analysis has been increasingly applied to the stock market domain. In particular, investor sentiment indicators can be used to model and predict stock market variables. In this context, the quality of the sentiment analysis is highly dependent on the opinion lexicon adopted. However, there is a lack of lexicons adjusted to microblogging stock market data. In this work, we propose an automatic procedure for the creation of such a lexicon by exploring a large set of labeled messages from StockTwits, a popular financial microblogging service, and using four statistical measures: adaptations of the known TF-IDF, Information Gain, Class Percentage, and a newly proposed Weighted Class Probability. The obtained lexicons are competitive when compared with a set of six reference lexicons. Moreover, we verified that it is beneficial to use continuous sentiment scores instead of sentiment labels.
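To make the lexicon-creation idea above concrete, the following is a minimal sketch of scoring terms from labeled messages. It uses a simple class-proportion score, which is an assumption for illustration and not necessarily the paper's definition of Class Percentage or Weighted Class Probability; the function name `build_lexicon`, the labels `bullish`/`bearish`, and the sample messages are likewise hypothetical.

```python
from collections import Counter


def build_lexicon(messages, min_count=2):
    """Score each term in [-1, 1] from (text, label) pairs.

    Assumption: labels are 'bullish' or 'bearish'. The score is the share of
    bullish messages containing the term minus the share of bearish ones.
    """
    pos, neg = Counter(), Counter()
    for text, label in messages:
        tokens = set(text.lower().split())  # count each term once per message
        (pos if label == "bullish" else neg).update(tokens)

    lexicon = {}
    for term in set(pos) | set(neg):
        total = pos[term] + neg[term]
        if total >= min_count:  # drop terms seen in too few messages
            lexicon[term] = (pos[term] - neg[term]) / total
    return lexicon


# Hypothetical labeled messages, for illustration only.
sample = [
    ("buy the dip", "bullish"),
    ("strong buy signal", "bullish"),
    ("sell now", "bearish"),
    ("time to sell", "bearish"),
    ("buy or sell", "bearish"),
]
lexicon = build_lexicon(sample)
```

The score is continuous rather than a hard positive/negative label, in line with the abstract's remark that continuous sentiment scores were found beneficial.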
Microblogging in the workplace as a functionality of Enterprise Social Networking (ESN) platforms is a relatively new phenomenon whose use in knowledge work has not yet received much research attention. In this cross-sectional study, I attempt to shed light on the role of microblogging in knowledge work. I identify microblogging use practices of knowledge workers on ESN platforms, and I identify its role in supporting knowledge work performance. A questionnaire is carried out among a non-representative sample of knowledge workers. The results shed light on the purposes of the microblogging messages that knowledge workers write. They also help us find out whether microblogging supports them in performing their work. The survey is based on existing theory that supplied me with possible microblog purposes as well as theory on what the actions of knowledge workers are. The results reveal that “knowledge & news sharing”, “crowd sourcing”, “socializing & networking” and “discussion & opinion” are frequent microblog purposes. The study furthermore shows that microblogging benefits knowledge workers’ work. Microblogging seems to be a worthy addition to the existing means of communication in the workplace, and is especially useful to let knowledge...
Large-scale analysis and statistics of socio-technical systems that just a few short years ago would have required considerable economic and human resources can nowadays be conveniently performed by mining the enormous amount of digital data produced by human activities. Although a characterization of several aspects of our societies is emerging from the data revolution, a number of questions concerning the reliability and the biases inherent to the big data “proxies” of social life are still open. Here, we survey worldwide linguistic indicators and trends through the analysis of a large-scale dataset of microblogging posts. We show that available data allow for the study of language geography at scales ranging from country-level aggregation to specific city neighborhoods. The high resolution and coverage of the data allow us to investigate different indicators such as the linguistic homogeneity of different countries, the touristic seasonal patterns within countries and the geographical distribution of different languages in multilingual regions. This work highlights the potential of geolocalized studies of open data sources to improve current analysis and develop indicators for major social phenomena in specific communities.
The cumulative effect in social contagion underlies many studies on the spread of innovation, behavior, and influence. However, few large-scale empirical studies have been conducted to validate the existence of the cumulative effect in information diffusion on social networks. In this paper, using a population-scale dataset from the largest Chinese microblogging website, we conduct a comprehensive study of the cumulative effect in information diffusion. We base our study on the diffusion network of a message, where nodes are the involved users and links characterize the forwarding relationships among them. We find that multiple exposures to the same message indeed increase the likelihood of forwarding it. However, additional exposures do not further improve the chance of forwarding once the number of exposures passes its peak at two. This finding questions the cumulative effect hypothesis in information diffusion. Furthermore, to clarify the forwarding preference among users, we investigate both structural motifs in the diffusion network and temporal patterns in the information diffusion process. The findings provide some insights for understanding the variation of message popularity and explain the characteristics of the diffusion network.
Microblogging is prevalent because it allows easy and anonymous information sharing on the Internet, which also brings the problem of spreading negative topics, or even rumors. Many researchers have focused on how to find and trace emerging topics for analysis. Adopting topic detection and tracking techniques to find hot topics in streamed microblogging data meets obstacles such as clustering streamed microblogging data, defining topic hotness, and discovering emerging hot topics. This paper presents a novel prerecognition model for hot topic discovery. In this model, the concepts of the topic life cycle, the hot velocity, and the hot acceleration are introduced to calculate the change in topic hotness, with the aim of discovering emerging hot topics before they surge and break out. Our experiments show that the new model helps discover potential hot topics efficiently and achieves considerable performance.
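The abstract above does not define hot velocity or hot acceleration. One plausible reading, sketched here purely as an assumption, treats topic hotness as a per-interval series (for example, post counts) and takes velocity and acceleration as its first and second discrete differences. The function names, thresholds, and sample series are hypothetical.

```python
def hot_velocity(hotness):
    """First difference of the hotness series (change per interval)."""
    return [b - a for a, b in zip(hotness, hotness[1:])]


def hot_acceleration(hotness):
    """Second difference of the hotness series (change in velocity)."""
    velocity = hot_velocity(hotness)
    return [b - a for a, b in zip(velocity, velocity[1:])]


def is_emerging(hotness, min_velocity=0, min_acceleration=0):
    """Flag a topic whose hotness is rising and rising faster in the latest interval.

    Assumption: this threshold rule is illustrative, not the paper's model.
    """
    velocity = hot_velocity(hotness)
    acceleration = hot_acceleration(hotness)
    if not acceleration:  # fewer than three observations
        return False
    return velocity[-1] > min_velocity and acceleration[-1] > min_acceleration


# Hypothetical post counts per time window for two topics.
rising = [3, 4, 6, 11, 20]
flat = [5, 5, 5, 5]
```

Requiring positive acceleration as well as positive velocity is what lets such a rule react before a topic peaks, rather than only once it is already popular.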
peer-reviewed; Microblogging is one of the popular forms of social media that has
quickly permeated both enterprise and open source communities. However, how
exactly open source communities can leverage microblogging is not yet well
understood. We investigate how Drupal’s open source community uses Twitter, a
household name in microblogging. Our analysis of group and individual accounts of
Drupal developers reveals that they take on similar but distinct roles. Both serve as
communicators of essential links to a vast and growing community knowledge base,
such as work artifacts, issues, documentation, and blog posts. Community members
often express positive emotions when tweeting about work, which reinforces a sense
of community. Finally, Twitter is also used as a crowdsourcing channel to solicit
This article presents the results of a study carried out at a school in the
province of Toledo (Spain) on the impact of activities based on school
microblogging, using the free social network “Edmodo”, on the development of
two basic competencies: linguistic competence, and digital and
information-processing competence, in the second year of Compulsory Secondary
Education. Linguistic-digital activity based on the collaboration,
participation, and interactivity of students and teachers through short
messages on social networks is an essential resource for the social and
linguistic formation of secondary school students in the virtual world, and it
contributes invaluably to the cross-curricular and interdisciplinary
development of the basic competencies. Training students to access virtual
tools for structuring information gives them the competence needed to face the
current socio-technological context, in which knowledge is acquired through
the possibility not only of reading and being
Microblogging with Edmodo for the development of the basic competencies of
secondary school students. A case study
SUMMARY: Following the evolution of the Web, Web 2.0 emerged, and with it social
tools that companies later adopted.
Three of the most widely used Social Media applications will be studied, namely
Blogs, Social Networks, and Microblogging, through an examination of the structure
of each, their types, and their use by companies.
After the theoretical study, the Blogs, Social Networks (Tuenti and
Facebook), and Microblogging (Twitter) of the main mobile phone companies
in Spain (Movistar, Vodafone, Orange, Yoigo, and Simyo) will be examined.
Once the tools each company has are identified, we will analyze whether they are
used correctly according to previously established factors. In this way,
conclusions will be drawn about their use.
The limitations that arose during the study of these tools will be analyzed,
and future lines of development will be considered.; ABSTRACT: After the evolution of the Web, Web 2.0 emerged, along with social
tools that companies later adopted.
The three most widely used Social Media applications will be studied, such as Blogs...
In recent years, microblogging services such as Twitter have become a popular tool for expressing feelings and opinions, broadcasting news, and communicating with friends. Twitter users produced more than 340 million tweets per day, which may be considered a rich source of user information. We take a supervised approach to the problem, but leverage existing hashtags in Twitter for building our training data. Finally, we tested the Spanish emotional corpus applying two different machine learning algorithms for emotion identification, reaching about 65% accuracy.; This work was supported in part by Projects MEyC TEC2012-37832-C02-01, MEyC TEC2011-28626-C02-02 and CAM CONTEXTS (S2009/TIC-1485); Proceedings of: 11th Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS 13). Salamanca, Spain, May 22-24, 2013.
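The idea of using existing hashtags to build training data can be sketched as follows. This is only an illustration under stated assumptions: the hashtag-to-emotion mapping `EMOTION_HASHTAGS`, the function `label_tweet`, and the example tweets are hypothetical, and the abstract does not say which hashtags or cleaning steps the authors used.

```python
import re

# Hypothetical mapping from Spanish hashtags to emotion labels.
EMOTION_HASHTAGS = {
    "#feliz": "joy",
    "#triste": "sadness",
    "#enfadado": "anger",
}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return (cleaned_text, emotion) or None if the tweet is unusable.

    A tweet is kept only when its hashtags point to exactly one emotion.
    All hashtags are stripped so the label does not leak into the features.
    """
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    emotions = {EMOTION_HASHTAGS[tag] for tag in tags if tag in EMOTION_HASHTAGS}
    if len(emotions) != 1:
        return None
    cleaned = " ".join(HASHTAG_RE.sub("", text).split())
    return cleaned, emotions.pop()


labeled = label_tweet("Hoy es un gran día #feliz")
ambiguous = label_tweet("#feliz pero #triste")
unlabeled = label_tweet("sin etiqueta")
```

The resulting (text, emotion) pairs could then be fed to any supervised classifier; discarding tweets with conflicting hashtags trades corpus size for cleaner labels.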
Documenting the context in which data are collected is an integral part of
the scientific research lifecycle. In field-based research, contextual
information provides a detailed description of scientific practices and thus
enables data interpretation and reuse. For field data, losing contextual
information often means losing the data altogether. Yet, documenting the
context of distributed, collaborative, field-based research can be a
significant challenge due to the unpredictable nature of real-world settings
and to the high degree of variability in data collection methods and scientific
practices of different researchers. In this article, we propose the use of
microblogging as a mechanism to support collection, ingestion, and publication
of contextual information about the variegated digital artifacts that are
produced in field research. We perform interviews with scholars involved in
field-based environmental and urban sensing research, to determine the extent
of adoption of Twitter and similar microblogging platforms and their potential
use for field-specific research applications. Based on the results of these
interviews as well as participant observation of field activities, we present
the design, development, and pilot evaluation of a microblogging application
integrated with an existing data collection platform on a handheld device. We
investigate whether microblogging accommodates the variable and unpredictable
nature of highly mobile research and whether it represents a suitable mechanism
to document the context of field research data early in the scientific
information lifecycle.; Comment: Proceedings of the 45th Hawaii International Conference on System
Science (HICSS-45 2012)
China has the largest number of online users in the world, and about 20% of
internet users are from China. This is a huge, as well as mysterious, market
for the IT industry for various reasons, such as cultural differences. Twitter is
the largest microblogging service in the world and Tencent Weibo is one of the
largest microblogging services in China. Employing the two data sets as sources
in our study, we try to unveil the unique behaviors of Chinese users. We
collected the entire Tencent Weibo from 10 Oct 2011 to 5 Jan 2012 and
obtained 320 million user profiles and 5.15 billion user actions. We study Tencent
Weibo at both the macro and micro levels. At the macro level, Tencent users are
more active in forwarding messages but have fewer reciprocal relationships than
Twitter users; their topic preferences differ markedly from those of Twitter users
in terms of both content and time consumption; in addition, information diffuses more
efficiently in Tencent Weibo. At the micro level, we mainly evaluate users'
social influence using two indexes, "Forward" and "Follower"; we study how
users' actions contribute to their social influence and further identify
unique features of Tencent users. According to our studies, Tencent users'
actions are more personalized and diversity...
In microblogging, hashtags are used as topical markers, and they are
adopted by users who contribute similar content or express a related idea.
However, hashtags are created freely, with no domain category
information attached, which makes it hard for users to access an organized
presentation of hashtags. In this paper, we propose an approach that classifies
hashtags into news categories and then carries out a domain-sensitive popularity
ranking to find hot hashtags in each domain. The proposed approach first trains
a domain classification model with news content and news category information,
then detects microblogs related to a hashtag to serve as its representative text,
based on which the hashtag is assigned a domain. Finally, we
calculate the domain-sensitive popularity of each hashtag from multiple
factors to find the most hotly discussed hashtags in each domain. Preliminary
experimental results on a dataset from Sina Weibo, one of the largest Chinese
microblogging websites, show the usefulness of the proposed approach for describing
hashtags.; Comment: 2 pages, no figure, to appear at RCIS 2015
Online microblogging services, which people increasingly use to
share and exchange information, have emerged as a promising way to profile
multimedia content, in the sense of providing users with a socialized abstraction and
understanding of that content. In this paper, we propose a microblogging
profiling framework to provide a social demonstration of TV shows. The challenges
of this study are twofold. First, TV shows are generally offline, i.e.,
most of them do not originate from the Internet, so we need to create a
connection between these TV shows and online microblogging services. Second,
content in a microblogging service is extremely noisy for video profiling,
so we need to strategically retrieve the most relevant information for TV
show profiling. To address these challenges, we propose MAP, a
microblogging-assisted profiling framework, with the following contributions: i)
we propose a joint user and content retrieval scheme, which uses information
about both the actors and the topics of a TV show to retrieve related microblogs; ii)
we propose a social-aware profiling strategy, which profiles a video according
to not only its content but also the social relationships of its microblogging
users and its propagation in the social network; iii) we present some
In this paper, we propose a Concept-level Emotion Cause Model (CECM), instead
of mere word-level models, to discover the causes of microblogging users'
diversified emotions about a specific hot event. A modified topic-supervised biterm
topic model is used in CECM to detect emotion topics in event-related
tweets, and then context-sensitive topical PageRank is used to detect
meaningful multiword expressions as emotion causes. Experimental results on a
dataset from Sina Weibo, one of the largest microblogging websites in China,
show that CECM detects emotion causes better than baseline methods.; Comment: 2 pages, 2 figures, to appear at WWW 2015
This paper proposes a new microblogging architecture based on peer-to-peer
network overlays. The proposed platform comprises three mostly
independent overlay networks. The first provides distributed user registration
and authentication and is based on the Bitcoin protocol. The second one is a
Distributed Hash Table (DHT) overlay network providing key/value storage for
user resources and tracker location for the third network. The last network is
a collection of possibly disjoint "swarms" of followers, based on the
Bittorrent protocol, which can be used for efficient near-instant notification
delivery to many users. By leveraging existing and proven technologies,
twister provides a new microblogging platform offering security, scalability
and privacy features. A mechanism provides incentive for entities that
contribute processing time to run the user registration network, rewarding such
entities with the privilege of sending a single unsolicited ("promoted")
message to the entire network. The number of unsolicited messages per day is
defined in order to not upset users.
Microblogging services such as Twitter are an increasingly important way to
communicate, both for individuals and for groups through the use of hashtags
that denote topics of conversation. However, groups can be easily blocked from
communicating through blocking of posts with the given hashtags. We propose
#h00t, a system for censorship resistant microblogging. #h00t presents an
interface that is much like Twitter, except that hashtags are replaced with
very short hashes (e.g., 24 bits) of the group identifier. Naturally, with such
short hashes, hashtags from different groups may collide and #h00t users will
actually seek to create collisions. By encrypting all posts with keys derived
from the group identifiers, #h00t client software can filter out other groups'
posts while making such filtering difficult for the adversary. In essence, by
leveraging collisions, groups can tunnel their posts in other groups' posts. A
censor could not block a given group without also blocking the other groups
with colliding hashtags. We evaluate the feasibility of #h00t through traces
collected from Twitter, showing that a single modern computer has enough
computational throughput to encrypt every tweet sent through Twitter in real
time. We also use these traces to analyze the bandwidth and anonymity tradeoffs
that would come with different variations on how group identifiers are encoded
and hashtags are selected to purposefully collide with one another.; Comment: 10 pages...
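The short-hash idea in the #h00t abstract above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not specify the hash function, the key derivation, or the cipher. SHA-256 is used for both the 24-bit tag and the key, and an HMAC check stands in for decryption so the example needs only the standard library; note that an HMAC authenticates a post but, unlike the encryption the abstract describes, does not hide its text.

```python
import hashlib
import hmac


def short_tag(group_id: str, bits: int = 24) -> str:
    """Truncate SHA-256 of the group identifier to `bits` bits, shown as hex."""
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big") >> (256 - bits)
    width = (bits + 3) // 4  # hex digits needed for `bits` bits
    return f"#{value:0{width}x}"


def group_key(group_id: str) -> bytes:
    """Hypothetical key derivation: hash of a prefixed group identifier."""
    return hashlib.sha256(b"key:" + group_id.encode("utf-8")).digest()


def make_post(group_id: str, text: str) -> dict:
    """Attach the short tag and a keyed MAC (stand-in for encryption)."""
    mac = hmac.new(group_key(group_id), text.encode("utf-8"), hashlib.sha256)
    return {"tag": short_tag(group_id), "text": text, "mac": mac.hexdigest()}


def belongs_to(group_id: str, post: dict) -> bool:
    """Client-side filter: the tag must match and the MAC must verify."""
    if post["tag"] != short_tag(group_id):
        return False
    expected = hmac.new(
        group_key(group_id), post["text"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, post["mac"])


post = make_post("#freedom", "meeting at noon")
# Simulate a tag collision: same short tag as another group, different key.
collided = dict(post, tag=short_tag("#weather"))
```

Because the keyed check depends on the full group identifier rather than the short tag, a member of one group can discard another group's posts even when the visible tags collide, while an observer without the identifier sees only the shared tag.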