Hi! I have a toy dataset of posts and comments from a site, and I want to model (and predict) the number of posts.
About the data:
- The number of posts has a very strong weekly seasonality (weekend volume is about half the weekday volume), a seasonality across the months of the year (with a low season during summer) and a trend.
- A post can be deleted (e.g. as spam) or not deleted. It can have a score (a number from -100 to +100) and it can have some comments.
From the data I see that:
- If a post is deleted, the author is less likely to create another post.
- If a post is negatively scored, the author is less likely to post again.
- If a post has comments, the author is more likely to post again.
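A minimal sketch of how these three observations could be encoded as a return probability and turned into a what-if comparison. Everything numeric here is an assumption for illustration, not something estimated from the data: the effect sizes, the `Scenario` shares, the independence of the three outcomes, and the "follow-up post arrives the next day" rule are all hypothetical. It also ignores seasonality and trend.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical effect sizes for illustration only; not estimated from any data.
BASE_RETURN = 0.5       # chance an author posts again after a neutral post
DELETED_FACTOR = 0.4    # deletion lowers that chance
NEGATIVE_FACTOR = 0.7   # a negative score lowers it
COMMENT_FACTOR = 1.3    # receiving comments raises it


@dataclass
class Scenario:
    p_deleted: float    # share of posts that get deleted
    p_negative: float   # share of posts with a negative score
    p_commented: float  # share of posts with at least one comment


def return_probability(deleted, negative, commented):
    """Chance that the author of one post writes another post."""
    p = BASE_RETURN
    if deleted:
        p *= DELETED_FACTOR
    if negative:
        p *= NEGATIVE_FACTOR
    if commented:
        p *= COMMENT_FACTOR
    return min(p, 1.0)


def expected_return_rate(s):
    """Average return probability, treating the three outcomes as independent."""
    total = 0.0
    for deleted, negative, commented in product((False, True), repeat=3):
        weight = (
            (s.p_deleted if deleted else 1 - s.p_deleted)
            * (s.p_negative if negative else 1 - s.p_negative)
            * (s.p_commented if commented else 1 - s.p_commented)
        )
        total += weight * return_probability(deleted, negative, commented)
    return total


def expected_posts(first_posts_per_day, days, s):
    """Expected daily posts if each post triggers a follow-up the next day
    with the scenario's average return probability."""
    rate = expected_return_rate(s)
    posts, previous = [], 0.0
    for _ in range(days):
        today = first_posts_per_day + rate * previous
        posts.append(today)
        previous = today
    return posts


baseline = Scenario(p_deleted=0.1, p_negative=0.2, p_commented=0.3)
more_comments = Scenario(p_deleted=0.1, p_negative=0.2, p_commented=0.6)
print(expected_posts(100, 30, baseline)[-1])
print(expected_posts(100, 30, more_comments)[-1])
```

Changing one share in `Scenario` and comparing the resulting curves is the kind of what-if comparison in question; a real model would need the factors estimated from the data.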
I want to model the number of posts and predict it for the next day / month / year. I used Prophet and it worked very well. Now I am investigating whether and how to build a model for what-if scenarios. For example, what will the number of posts look like in the long run if users of the site become more active in comments, or if a group of users starts downvoting / upvoting all posts?
For the deleted / not deleted state, I can represent the total number of posts as the sum of two models, one for deleted posts and one for not deleted posts, and fit Prophet to each independently. It gets tricky, though, when I try to combine deleted / not deleted with score and comments.
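A rough sketch of that two-model split, assuming daily counts held in plain lists and the `prophet` package. The function names `prophet_forecast` and `forecast_total` are hypothetical, and the `forecaster` argument exists only so that a different model can be swapped in.

```python
def prophet_forecast(dates, counts, horizon_days):
    """Fit Prophet to one daily series and return point forecasts (yhat)
    for the next `horizon_days` days."""
    # Imported here so the rest of the sketch works without these packages.
    import pandas as pd
    from prophet import Prophet

    history = pd.DataFrame({"ds": pd.to_datetime(dates), "y": counts})
    model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
    model.fit(history)
    future = model.make_future_dataframe(periods=horizon_days, freq="D")
    forecast = model.predict(future)
    # The future frame contains the history too; keep only the new days.
    return forecast["yhat"].tail(horizon_days).tolist()


def forecast_total(dates, deleted, kept, horizon_days, forecaster=prophet_forecast):
    """Forecast deleted and not deleted posts separately and add them up."""
    deleted_hat = forecaster(dates, deleted, horizon_days)
    kept_hat = forecaster(dates, kept, horizon_days)
    return [d + k for d, k in zip(deleted_hat, kept_hat)]
```

Only the point forecasts are summed here; the uncertainty intervals of the two models cannot simply be added the same way.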
I would appreciate it if you could suggest a way to model this use case and / or point me to resources where I can read more about possible approaches.