Twitter bots are here to stay!
Whenever anyone signs up for an account on a website, the user is usually obliged to prove that they are not a robot by responding to a CAPTCHA, or 'Completely Automated Public Turing test to tell Computers and Humans Apart'. It is a security layer that distinguishes human from machine input, and comes in several forms: word-solving, picture, audio, branded, 3D and math challenges. The primary goal of a CAPTCHA is to block spammers and bots trying to automatically harvest email addresses or sign up for, and make use of, websites, blogs or forums.
For the most part, CAPTCHAs are successful at keeping harmful bots out during signup. The problem is that a human can complete the signup and then hand the account over to a bot — something some websites actually allow.
Twitter, for example, is aware that bots operate on the social network because it allows them, but only for the purpose of spreading information. This works for companies and organizations that rely on this form of automation, but it spells trouble for managing website content: detecting and regulating these bots becomes hard when their purpose shifts to swaying public behavior.
Abdullah Mueen, assistant professor at the Department of Computer Science at the University of New Mexico, acknowledged this loophole. He said, "Twitter rules allow some bots for informational purposes but doesn't allow them to be created with the intention of swaying public behavior, but they are being created so quickly and are so hard to detect that they are going undetected."
To solve this problem, Mueen and students from the university developed a technology to detect the bad bots and prevent new ones from appearing. In the process of creating it, they formed a startup called BotAlert Inc.
The researchers have begun deploying their technology, called DeBot. Since then, they have detected about 700,000 bots, and found that about 1,500 bots, both legitimate and not, are created on Twitter each day.
But the bots lurking around Twitter are not all the same. There are 'good' bots, like the ones used by CNN Sports or CNN Politics, whose only job is to post news updates automatically, with no human touch required. These bots are permitted by Twitter because they are simply sharing news.
On the other hand, there are 'bad' bots, made to impersonate people, hide behind fake identities, and influence public opinion. In online contests where the public is asked to vote among candidates, armies of bots can cast votes for a competing person or group that look legitimate. This happened in the iHeart Media popular choice contest, where bots were involved in massive voting campaigns.
The political landscape is not exempt from this technology either, especially with the U.S. presidential election underway. The team monitored bots mentioning the two presidential candidates, Hillary Clinton and Donald Trump, using IBM's Watson open-source analytics. They found twice as many bots mentioning Clinton in a negative light as there are mentioning Trump in a negative light.
The solution Mueen and the students built relies mostly on detecting and correlating the bots' activities. This is evident on DeBot's website, where a gallery of identical tweets posted by two completely different users can be found.
Mueen explained, “To the user seeing one of these posts, this may not seem suspicious, but our technology has found many examples of highly-correlated Twitter accounts that are retweeting identical content within 10 seconds of each other.
“Of course, two users could not be simultaneously posting the exact same content for hours, so this is naturally suspicious and identifies a bot,” he added.
DeBot has proven effective so far because of its pipeline: it listens for keywords, indexes and picks up suspicious words, monitors suspicious users, and clusters those users to find highly correlated accounts. The technology is better at identifying bots than Twitter's own account suspensions, Mueen claimed.
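The clustering step described above can be illustrated with a toy sketch. This is not DeBot's actual code (the real system uses more sophisticated time-series cross-correlation on streaming data); it simply shows the core idea under simple assumptions: bin each account's tweet timestamps into an activity series, then flag pairs of accounts whose series are almost perfectly correlated — the signature of two accounts posting in lockstep seconds apart.

```python
from itertools import combinations

def activity_signal(timestamps, window=600, bin_size=10):
    """Bin tweet timestamps (seconds since start of window) into a
    fixed-length activity series of tweet counts per 10-second bin."""
    bins = [0] * (window // bin_size)
    for t in timestamps:
        if 0 <= t < window:
            bins[t // bin_size] += 1
    return bins

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb) if sa and sb else 0.0

def correlated_pairs(users, threshold=0.9):
    """Return pairs of accounts whose binned activity correlates
    above the threshold -- candidates for bot clusters."""
    signals = {u: activity_signal(ts) for u, ts in users.items()}
    return [(a, b) for a, b in combinations(signals, 2)
            if pearson(signals[a], signals[b]) >= threshold]

# Hypothetical data: two accounts echoing each other within seconds,
# plus one independently active account.
users = {
    "bot_a": [5, 65, 125, 305, 425],
    "bot_b": [8, 68, 128, 308, 428],   # mirrors bot_a within 10 seconds
    "human": [40, 250, 590],
}
print(correlated_pairs(users))  # -> [('bot_a', 'bot_b')]
```

In practice the hard part is doing this at streaming scale over millions of accounts, which is why DeBot first narrows the field with keyword filtering before clustering.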
The team looks forward to a more robust system that can handle huge volumes of data. Mueen said that Twitter, which has 313 million active users, provides them with only 1% of its data, and yet they have already detected thousands of bad bots within that sample. He admitted that this is a huge computational task, but their goal is to understand who is behind the bots in order to stop their creation and spread.