2020.12.12 05:20 rannar7 DreamWasTaken2
2018.12.09 00:23 icoinformation2021 ResidentEvil2Remake
2016.04.28 18:15 AMG_Jackett Rising Storm 2: Vietnam
2023.03.18 00:00 Top_Weekly_Bot Top post from r/DreamWasTaken2 Thanks to CaptainPuffy for taking the time to put this together! You can check out the photos at puffy.gg Mar 17 2023
|submitted by Top_Weekly_Bot to SubredditTopWeekly [link] [comments]|
2023.02.17 03:04 tehnoob69 I did it
|submitted by tehnoob69 to shitposting [link] [comments]|
2023.01.26 09:11 Aiso202 😭😭😭😭😭😭😭😭
|submitted by Aiso202 to shitposting [link] [comments]|
2023.01.11 06:05 God689Windows LMFAO
submitted by God689Windows to DreamWasTaken2 [link] [comments]
2022.12.20 20:42 LothernSeaguard Two Years: A Data-Driven Overview of the Subreddit
submitted by LothernSeaguard to DreamWasTaken2 [link] [comments]
Introduction:
In case you all haven’t noticed, the subreddit reached its 2 year anniversary last week, and I thought it would be a good time to reflect on how dreamwastaken2 has changed and grown over the years. However, seeing as I only joined the subreddit in early 2021 and was only really active from late 2021 to summer 2022, I don’t have the personal experience to comment on the entire history of the subreddit. As such, I decided to perform an analysis of the subreddit using elementary data science and natural language processing methods.
Methodology:
The dataset was obtained in 2 passes: first, I got all possible posts and comments recorded on the Pushshift API (https://redditsearch.io/) from December 12th, 2020 to December 11th, 2022. In addition, I also received data from u/dreamwastakenGPT2 containing a dataset they obtained from training a natural language model based on the subreddit.
However, the Pushshift API often doesn’t update the post metadata, meaning that working solely off of Pushshift wouldn’t cover a lot of comments and modified karma scores. To fix this, I called Reddit’s official API (via PRAW: https://praw.readthedocs.io/en/stable/) over every post scraped by Pushshift to get all new comments and updated karma scores.
Finally, I removed all posts and comments that were either removed by moderators or deleted by users to obtain a total of 418,215 posts and comments from the subreddit to analyze.
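The filtering step can be sketched roughly as follows. This is a minimal illustration, not the exact script used for this post: it assumes each item is a dict with a `body` field and that removed or deleted content appears as the placeholder strings `[removed]` and `[deleted]`. The helper names `keep_item` and `filter_items` are hypothetical.

```python
# Placeholder bodies assumed to mark content removed by moderators
# or deleted by users.
PLACEHOLDERS = {"[removed]", "[deleted]"}


def keep_item(item):
    """Return True if the item's text is not a removed/deleted placeholder."""
    body = item.get("body") or ""
    return body.strip() not in PLACEHOLDERS


def filter_items(items):
    """Keep only the items that still carry their original text."""
    return [item for item in items if keep_item(item)]


# Hypothetical sample data.
sample = [
    {"id": "a1", "body": "Great analysis!"},
    {"id": "a2", "body": "[removed]"},
    {"id": "a3", "body": "[deleted]"},
]
kept = filter_items(sample)
print([item["id"] for item in kept])
```

For posts, the text field would likely have a different name than for comments, so a real script would need to check both.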
I used Python 3 for all data manipulation and figure generation in this post. Specifically, I used matplotlib to display the graphs, pandas for data manipulation, numpy for regression analysis, nltk for text processing and sentiment analysis, and detoxify for toxicity analysis.
Results & Analysis:
Subreddit Activity:
Fig 1: Number of Posts and Comments made by day
The first metric I looked at was subreddit activity via how many posts and comments were made every day. There are several activity spikes, namely around June 2021 and mid-October 2022. I am not completely certain what the spike in June 2021 was, but I believe it was the RBGC drama, of which more information can be found here: https://www.reddit.com/DreamWasTaken2/comments/ngtxlg/twitter_nuke_rbgc/. Mid-October 2022 was obviously the grooming allegations (the smaller spike earlier in October is most likely the face reveal).
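Counting activity per day comes down to grouping items by their creation date. Below is a minimal sketch using only the standard library; it assumes each post/comment has a Unix timestamp in UTC (Reddit exposes this as `created_utc`), and the function name `count_per_day` and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def count_per_day(timestamps):
    """Count items per UTC calendar day from Unix timestamps (seconds)."""
    days = (
        datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        for ts in timestamps
    )
    return Counter(days)


# Hypothetical timestamps: two items on one day, one on the next.
created = [1607750400, 1607754000, 1607836800]
daily = count_per_day(created)
for day, n in sorted(daily.items()):
    print(day, n)
```

Plotting the sorted `(day, count)` pairs then gives a figure like Fig 1.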
Fig 2: Average daily Karma over time, with polynomial regression line.
This second figure is the average karma per day, so one point in the scatter plot corresponds to a single day. Since the subreddit has grown significantly, a steadily increasing line should be expected, but that doesn’t appear to be the case. As evidenced by the plot, karma per day seems to fluctuate more in later periods than it did in early and mid-2021. My best explanation is that users got more fatigued with how often drama was occurring and would only get back on the subreddit for larger instances of drama, which is why there would be on and off days in terms of karma depending on the scale of the drama occurring that day.
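For anyone curious how a regression line like this can be produced, here is a minimal sketch with numpy. The data and the helper `daily_mean` are hypothetical, and the polynomial degree of 2 is an assumption, since the degree used for the figure isn't stated.

```python
from collections import defaultdict

import numpy as np


def daily_mean(records):
    """Turn (day_index, score) pairs into arrays of days and mean scores."""
    buckets = defaultdict(list)
    for day, score in records:
        buckets[day].append(score)
    days = sorted(buckets)
    means = [sum(buckets[d]) / len(buckets[d]) for d in days]
    return np.array(days, dtype=float), np.array(means, dtype=float)


# Hypothetical data: (days since the subreddit was created, karma score).
records = [(0, 1), (0, 3), (1, 4), (1, 2), (2, 6), (2, 6),
           (3, 9), (3, 13), (4, 18)]
days, means = daily_mean(records)

degree = 2  # assumed; not stated in the post
coeffs = np.polyfit(days, means, degree)  # coefficients, highest power first
trend = np.polyval(coeffs, days)          # fitted value for each day
print(coeffs)
```

The scatter plot uses `days` against `means`, and the regression line uses `days` against `trend`.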
Fig 3: Word cloud of all posts of the entire subreddit, stopwords excluded.
Fig 4: Word cloud of all posts of the entire subreddit, 1000 most common words in the English language excluded.
This is more out of fun than any real analysis, but here’s the subreddit’s word cloud. Nothing’s really exceptional here. Obviously, Dream and Twitter are going to be the most used words, and other terms like the beginning of hyperlinks (https), referencing content and video are expected.
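The counts behind a word cloud are just word frequencies with some words filtered out. A minimal sketch, assuming a simple regex tokenizer and a tiny made-up stopword list (a real run would use a full list such as the one that ships with nltk); `word_counts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Tiny illustrative stopword list, NOT a complete one.
STOPWORDS = {"the", "a", "an", "and", "is", "on", "of", "to", "in", "it"}


def word_counts(texts, excluded=STOPWORDS):
    """Count lowercase words across all texts, skipping excluded words."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in excluded:
                counts[word] += 1
    return counts


# Hypothetical sample texts.
texts = ["Dream is on Twitter", "the Dream video",
         "Twitter and the video of Dream"]
counts = word_counts(texts)
print(counts.most_common(3))
```

Swapping `excluded` for a list of the 1000 most common English words gives the second variant of the word cloud.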
Fig 5: Average lexical diversity over time
Fig 6: Median word length of posts and comments over time.
So I actually had a lot of difficulty with analyzing complexity, since I know there’s a common complaint that the subreddit’s discussions have deteriorated in quality. The main issue I encountered is that readability tests (like Flesch-Kincaid, Gunning-Fog, and Dale-Chall) that are typically used to determine the complexity of text fare really poorly when analyzing the kind of posts and comments that this subreddit tends to make. From what I can find, the use of abbreviations and community-specific terms messes with dictionary-based readability tests like Dale-Chall, and the tendency of the subreddit to not use punctuation badly affects tests that use sentence length to determine complexity.
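To make the punctuation problem concrete, here is a small sketch of the standard Flesch-Kincaid grade level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The syllable counter is a very crude vowel-run heuristic and the sentence splitter only looks at `.`, `!` and `?`; both are simplifying assumptions for illustration, not how a proper readability library works.

```python
import re


def count_syllables(word):
    """Very rough estimate: number of vowel runs, at least 1 per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fk_grade(text):
    """Flesch-Kincaid grade level with naive sentence/syllable counting."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    # Text without any ., ! or ? counts as one long sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


punctuated = "I saw the video. It was fine. The drama is old. Nobody cares now."
unpunctuated = "I saw the video it was fine the drama is old nobody cares now"
print(fk_grade(punctuated), fk_grade(unpunctuated))
```

Both strings contain the same words, but the unpunctuated one is scored as a single long sentence and so gets a noticeably higher grade level, even though it is no harder to read.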
As such, I resorted to simpler metrics, such as the overall length and the share of unique words of a post/comment. However, these have their own share of issues, since length treats an essay the same as a copypasta, and lexical diversity favors shorter posts/comments. As seen above, the second issue can clearly be shown, as the regression line for lexical diversity is almost the exact inverse of the regression line for median word length. Given that, it will likely require a different approach to determine if subreddit content/discussions have really declined in quality.
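Both metrics are simple to compute. The sketch below assumes lexical diversity means unique words divided by total words, and that length is measured in words per post/comment; the regex tokenizer and the function names are illustrative stand-ins rather than the nltk-based processing used for the figures.

```python
import re
from statistics import median


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def lexical_diversity(text):
    """Unique words divided by total words (0.0 for empty text)."""
    words = tokenize(text)
    return len(set(words)) / len(words) if words else 0.0


def median_length(texts):
    """Median number of words per post/comment."""
    return median(len(tokenize(t)) for t in texts)


# Hypothetical examples: a short comment and a longer, repetitive one.
short = "dream posted again"
long_text = "the drama about the video is the same drama as the last drama"

print(lexical_diversity(short), lexical_diversity(long_text))
print(median_length([short, long_text]))
```

The short comment scores a perfect 1.0 simply because it has no room to repeat words, which is the bias toward shorter posts/comments described above.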
There’s also the weird anomalies in what looks like early April 2021 that I cannot explain (perhaps someone more knowledgeable of subreddit lore can explain).
As mentioned before, I used a library called detoxify (https://github.com/unitaryai/detoxify) to analyze the toxicity of every post/comment. The results are not authoritative, however. While detoxify has scored high on some data science competition datasets, I did not fine-tune the model specifically on the subreddit’s comments, so there may be some biases present. Still, I think internet toxicity has general similarities across all platforms and contexts, which is why I decided to use this library to run a toxicity analysis on the subreddit.
Fig. 7: Average sentiment over time. Positive values are more positive sentiments, negative values are more negative sentiments.
While not a direct measure of toxicity, the sentiment is a decent indicator of where trends are going. Obviously, if discussion is more toxic, the sentiment will be more negative, but that is not the case here. In fact, the regression line indicates a slight increase in sentiment over time, although when factoring in error, I would say that there is a negligible difference between the sentiment of December 2020 and December 2022.
Fig 8: The aggregate toxicity score over time generated by detoxify, from a scale of 0 to 1 (1 being the most toxic).
As seen in the figure above, the regression line indicates that toxicity peaked in late 2021/early 2022 before decreasing slightly, although again, the difference is negligible when factoring in error. Of most interest seems to be the outliers, namely in early April 2021 (which contains the most and least toxic days of the subreddit) and summer 2022. I don’t really have good explanations for either, although we can see that the outliers in April 2021 also appear on the previous graphs for word length and lexical diversity.
Fig 9: Toxicity score from threats from a scale of 0 to 1 (1 being the most toxic).
Given that the scores here are a fraction of the total scores, it seems that this subreddit doesn’t have an issue with threats, which aligns with my own personal experience. It does seem that 2022 brought quite a few more days with an increased threat score, although I’m more interested in why the scores are more “clustered together” (for lack of a better term) around summer 2021. Like before, the regression line indicates that toxicity peaked in late 2021/early 2022 before decreasing slightly, although the difference is negligible (0.005 to 0.004).
Fig 10: Toxicity score from severe instances of toxicity from a scale of 0 to 1 (1 being the most toxic).
Severe toxicity appears to measure whether a post/comment is very toxic or not, and as this graph shows, it is very much not an issue. The few outliers appear to correlate with general spikes in toxicity.
Fig 11: Toxicity score from sexually explicit instances of toxicity from a scale of 0 to 1 (1 being the most toxic).
The scatter plot looks similar to the plot of general toxicity, albeit with more pronounced outliers (again, early April 2021 stands out). Not much to say.
Fig 12: Toxicity score from identity attacks (racism, discrimination, slurs, etc.) on a scale of 0 to 1
The outliers are less pronounced, which to me indicates that people do not resort to identity attacks much more during heated moments, but the larger number of outliers is somewhat concerning. However, one thing that may confuse the model is when people are talking about the usage of identity attacks in good faith (i.e. not using them themselves), so this may correlate with instances of drama regarding identity attacks, such as a content creator using slurs.
Fig 13: Toxicity scores from swearing/obscenities from a scale of 0 to 1 (1 being the most toxic).
These values are much higher than any other subcategory of toxicity, which indicates that swearing is a major source of toxicity on this subreddit. The trends on the scatterplot correlate with the scatterplot of all toxicity, although I don’t know what caused the spike at the very beginning of the subreddit. Maybe heated discussion over the speedrunning dream before the moderators reined it in?
Fig 14: Toxicity scores from personal insults from a scale of 0 to 1 (1 being the most toxic)
Like with obscenities, these values are much larger than other subcategories, so we can reasonably conclude that personal insults and swearing are the largest sources of toxicity on the subreddit. Again, the scatter plot follows the general trend of total toxicity, with the major outliers being a spike in early April 2021 and late summer 2022.
Looking at all these in aggregate, the points of interest appear to be early April 2021 and late summer 2022 for high instances of toxicity, and summer 2021 for less day-to-day variation in toxicity. I don’t really have good explanations for any of these, but I would welcome theories explaining these oddities.
Interestingly enough, the recent drama regarding the allegations against Dream does not noticeably appear on the scatterplot of toxicity. Personally, my guess would be that toxicity doesn’t actually map to increased periods of drama, but rather periods where there was disagreement in the subreddit. It appears to me that this subreddit formed a general consensus pretty quickly regarding the allegations, so there was less heated debate and thus toxicity during that time period.
Looking at general trends, there doesn’t seem to be any noticeable change over time, when not considering the outliers. It looks like there is a slight decrease in toxicity starting from late 2021, but it’s negligible.
Conclusion:To be honest, I came into this data analysis with preconceived notions about the state of the subreddit. I was pleasantly surprised when the data did not bear out my hypotheses, since it seems that the subreddit really has not significantly changed based on the metrics I looked at. Only activity and the discussion subjects have really changed, while the regression lines for toxicity and complexity metrics don’t appear to have significantly changed.
That being said, this analysis is not perfect. Human language is an incredibly difficult thing to quantify, and there are researchers with much better tools and knowledge than me that are still grappling with how to quantify language objectively.
In this particular analysis, there were several metrics I wanted to compute but couldn’t implement. For instance, the readability tests were not accurate for the subreddit due to some unique quirks. I also wanted to plot the subreddit’s opinion of Dream over time, but I could not find a good entity-based sentiment analysis library to analyze that.
With all that out of the way, I still think that the subreddit has changed, just not in ways that are quantifiable by a machine (yet). Ultimately, this is the last major thing I want to contribute to this community after a year of posting and commenting (and another 6 months of lurking), as I just don’t think the changed community is right for me. I generally have been distancing myself from MCYT after the death of Technoblade, and I really don’t feel I have the experience, qualifications, or information to weigh in on something as serious as the allegations against Dream. There were also some posts/comments that just left a bad taste in my mouth (justified or not).
In a way, this is my farewell to the subreddit. It’s been a fun ride, and I made some of my best and highest effort posts here.
To quote Truman Burbank, “In case I don’t see ya’, good afternoon, good evening, and good night!”
2022.12.11 14:55 ihBOO 😼
|submitted by ihBOO to DreamWasTaken2 [link] [comments]|
2022.12.11 13:58 W1ps_ I thought I spent too much time here
Turns out I'm a grass toucher, really surprised me there
submitted by W1ps_ to DreamWasTaken2 [link] [comments]
2022.12.11 08:12 neddy470v1 Nearly 1000 hours
|submitted by neddy470v1 to DreamWasTaken2 [link] [comments]|
2022.12.10 18:08 DJH-SPAWN-123 Damn
|submitted by DJH-SPAWN-123 to DreamWasTaken2 [link] [comments]|
2022.12.10 16:19 mothussy I know these recaps are annoying but I am ashamed of myself
|submitted by mothussy to DreamWasTaken2 [link] [comments]|
2022.12.09 23:42 Artistic_Astro_57 😗
|submitted by Artistic_Astro_57 to DreamWasTaken2 [link] [comments]|
2022.12.09 23:09 CrazyUmbreonGirl Child’s Play
|submitted by CrazyUmbreonGirl to DreamWasTaken2 [link] [comments]|
2022.12.09 20:33 amiinnt I guess dream won
|submitted by amiinnt to DreamWasTaken2 [link] [comments]|
2022.12.09 14:22 amiinnt I guess dream won
|submitted by amiinnt to DreamWasTaken [link] [comments]|
2022.12.09 07:59 klorambusiili *technoblade bruuuh*
|submitted by klorambusiili to DreamWasTaken2 [link] [comments]|
2022.11.12 01:49 NicoTheSerperior More NFT garbage being peddled by Mattel , of all brands.
|submitted by NicoTheSerperior to shittymobilegameads [link] [comments]|
2022.10.08 23:52 MicroplasticEater Pff
|submitted by MicroplasticEater to DreamWasTaken2 [link] [comments]|
2022.09.30 17:44 klorambusiili Dream better make his post face reveal reddit post here instead of the normie sub because he's been more active here 😤
|submitted by klorambusiili to DreamWasTaken2 [link] [comments]|
2022.09.21 21:09 OneOfTheOlympians days since r/DreamWasTaken2 was dragged through the mud
2022.08.27 20:16 Dim0ndDragon15 The controversy is getting meta
|submitted by Dim0ndDragon15 to DreamWasTaken2 [link] [comments]|
2022.08.03 21:31 TrendingBot [Mildly Trending] /r/DreamWasTaken2 - DreamWasTaken2 (+201 subscribers today; 1,099% trend score)
submitted by TrendingBot to TrendingReddits [link] [comments]
2022.07.24 06:07 theultrasheeplord Reminder that discussing community drama breaks rule 4 and attacking another platform or community's breaks rule 6