Adam
Adam

Reputation: 339

Django project architecture advice

I have a django project and I have a Post model witch look like that:

class BasicPost(models.Model):
    author = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    published = models.BooleanField(default=False)
    created_date = models.DateTimeField(auto_now_add=True)
    title = models.CharField(max_length=100, blank=False)
    body = models.TextField(max_length=999)
    media = models.ImageField(blank=True)

    def get_absolute_url(self):
        return reverse('basic_post', args=[str(self.pk)])

    def __str__(self):
        return self.title

Also, I use the basic User model that comes with the basic django app.

I want to save witch posts each user has read so I can send him posts he haven't read. My question is what is the best way to do so, If I use Many to Many field, should I put it on the User model and save all the posts he read or should I do it in the other direction, put the Many to Many field in the Post model and save for each post witch user read it? it's going to be more that 1 million + posts in the Post model and about 50,000 users and I want to do the best filters to return unread posts to the user

If I should use the first option, how do I expand the User model? thanks!

Upvotes: 1

Views: 69

Answers (3)

LaurentP13
LaurentP13

Reputation: 66

On your first question (which way to go): I believe that ManyToMany by default creates indices in the DB for both foreign keys. Therefore, wherever you put the relation, in User or in BasicPost, you'll have the direct and reverse relationships working through an index. Django will create for you a pivot table with three columns like: (id, user_id, basic_post_id). Every access to this table will index through user_id or basic_post_id and check that there's a unique couple (user_id, basic_post_id), if any. So it's more within your application that you'll decide whether you filter from a 1 million set or from a 50k posts.

On your second question (how to overload User), it's generally recommended to subclass User from the very beginning. If that's too late and your project is too far advanced for that, you can do this in your models.py:

class BasicPost(models.Model):
    # your code
    readers = models.ManyToManyField(to='User', related_name="posts_already_read")

# "manually" add method to User class
def _unread_posts(user):
    return BasicPost.objects.exclude(readers__in=user)

User.unread_posts = _unread_posts

Haven't run this code though! Hope this helps.

Upvotes: 1

RHSmith159
RHSmith159

Reputation: 1592

Could you have a separate ReadPost model instead of a potentially large m2m, which you could save when a user reads a post? That way you can just query the ReadPost models to get the data, instead of storing it all in the blog post.

Maybe something like this:

from django.utils import timezone


class UserReadPost(models.Model):
    user = models.ForeignKey("auth.User", on_delete=models.CASCADE, related_name="read_posts")
    seen_at = models.DateTimeField(default=timezone.now)
    post = models.ForeignKey(BasicPost, on_delete=models.CASCADE, related_name="read_by_users")

You could add a unique_together constraint to make sure that only one UserReadPost object is created for each user and post (to make sure you don't count any twice), and use get_or_create() when creating new records.

Then finding the posts a user has read is: posts = UserReadPost.objects.filter(user=current_user).values_list("post", flat=True)

This could also be extended relatively easily. For example, if your BasicPost objects can be edited, you could add an updated_at field to the post. Then you could compare the seen_at of the UserReadPost field to the updated_at field of the BasicPost to check if they've seen the updated version.

Downside is you'd be creating a lot of rows in the DB for this table.

Upvotes: 1

Alex Antonov
Alex Antonov

Reputation: 15146

If you place your posts in chronological order (by created_at, for example), your option could be to extend user model with latest_read_post_id field.

This case:

class BasicPost(models.Model):
    # your code

    def is_read_by(self, user):
        return self.id < user.latest_read_post_id

Upvotes: 0

Related Questions