General Category => Everything Else => Topic started by: atomic7732 on October 09, 2017, 07:42:08 PM
Title: IRC science
Post by: atomic7732 on October 09, 2017, 07:42:08 PM
so i wrote a script that counts the uses of certain laughing acronyms cause i wanted to see how the usage of kol and lmao evolved through time on irc and got this using mine (April 2015 onward) and dar's (pre-April 2015) logs...
i wrote the script mostly with this laugh count in mind, so what it counts is the number of lines per month in which a phrase appears. it is not the total number of occurrences in the logs: multiple occurrences in the same line, say "kol kol", only count as one kol, which mitigates the effect of spam. if several different laughs (or phrases) appear in the same line, each one is counted once. phrases also only count when they have a space before them and a space or the end of the line after them, so "lmaoooo" does not count for lmao, and "kolkol" does not count for kol. this avoids false positives from words that just happen to contain "kol" or "lol". granted, there probably aren't many of those in english, but it matters for later searches i did. words at the beginning of a line have a space before them in the logs, so they are counted.
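a rough python sketch of the counting rule described above, in case anyone wants to try it on their own logs. this is not the actual script: the function name `count_lines_per_month`, the `(month, text)` input shape, and the case-sensitive matching are all assumptions made for illustration, and it expects the message text to already have its leading space.

```python
import re
from collections import Counter, defaultdict

def count_lines_per_month(lines, phrases):
    """lines: iterable of (month, text) pairs (hypothetical input shape).
    returns {phrase: Counter({month: number of lines containing the phrase})}."""
    # a phrase only matches with a space before it and a space or end of line after it
    patterns = {p: re.compile(" " + re.escape(p) + r"(?= |$)") for p in phrases}
    counts = defaultdict(Counter)
    for month, text in lines:
        for phrase, pattern in patterns.items():
            # search() is true or false per line, so "kol kol" adds 1, not 2
            if pattern.search(text):
                counts[phrase][month] += 1
    return counts

# made-up example lines, not taken from the real logs
example_lines = [
    ("2015-04", " kol kol"),            # one line with kol
    ("2015-04", " lmaoooo kolkol"),     # matches nothing
    ("2015-04", " lmao kol"),           # one line for lmao, one line for kol
    ("2015-05", " that was funny lol"), # phrase at end of line still counts
]
example_counts = count_lines_per_month(example_lines, ["kol", "lmao", "lol"])
```

with these example lines, kol ends up at 2 for 2015-04, lmao at 1, and lol at 1 for 2015-05.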
it's interesting to note that lmao appeared once and took over a year to actually start chipping away at the usage of kol, and during its first peak, it actually tended to displace lol.
after seeing these interesting results i decided... hey we could use this to track other memes and linguistic trends, so me and fiah thought up some words and we got this
apparently my renaming as kalassak ended the fight between "atomic", "atomic7732" and "nue" really abruptly
these are not nick usages, they are mentions, because as noted before, search phrases must have spaces around them. the logs put <> around nicks when people say things.
i did have to fix this to remove <iris-discord> and timestamps and nicks and such... i just used a simple algorithm assuming that people don't include the character ">" in their messages that much, which should be fairly reasonable. it just counts all characters after the last occurrence of ">"
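the ">" trick above is basically a one-liner in python. a minimal sketch, assuming a made-up line layout like `[19:42] <nick> message` (the real timestamp format isn't shown here, and `message_text` is a hypothetical name):

```python
def message_text(raw_line):
    # keep only what comes after the last ">" in the line;
    # rpartition returns the whole line as the last element if there is no ">" at all
    return raw_line.rpartition(">")[2]

# made-up lines in an assumed log layout
plain = message_text("[19:42] <atomic7732> kol")
relayed = message_text("[19:42] <iris-discord> <Darvince> bees")
quoted = message_text("[03:11] <Darvince> >time is real")
```

here `plain` is " kol" and `relayed` is " bees", so timestamps, nicks and the relay bot's nick are all gone while the leading space survives. `quoted` comes out as "time is real": a ">" inside the message itself gets eaten along with everything before it.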
Title: Re: IRC science
Post by: Darvince on November 17, 2018, 10:59:35 PM
the only time i can think of when > was included a lot was when we would say things like '>time is real'
Title: Re: IRC science
Post by: atomic7732 on November 17, 2018, 11:05:48 PM
yeah which only removes one character anyway
Title: Re: IRC science
Post by: Darvince on November 18, 2018, 03:11:50 AM
bees
Title: Re: IRC science
Post by: atomic7732 on November 18, 2018, 10:34:57 PM