Should be possible to update an existing model instance #23

Open
craiga opened this issue Jul 2, 2019 · 2 comments
craiga commented Jul 2, 2019

I'm scraping a listing of events and storing them in an Event model. This model inherits from model_utils.models.TimeStampedModel, which provides auto-populated created and modified fields that I find useful.

Unfortunately, there's no way for me to reuse an existing, populated instance of Event in my pipeline without manually copying each field from the DjangoItem to my model, so created is overwritten on every save.

There should be a way to do this, perhaps by allowing .instance to be set to my model instance and then moving attribute setting into .save()?


craiga commented Jul 2, 2019

FWIW, my pipeline looks like this:

class EventDjangoPipeline(object):
    def process_item(self, item, spider):
        try:
            event = models.Event.objects.get(url=item["url"])
            event_id = event.id
            event = item.save(commit=False)
            event.id = event_id

        except models.Event.DoesNotExist:
            pass

        item.save()
        return item

Ideally, I'd like to be able to do something like:

class EventDjangoPipeline(object):
    def process_item(self, item, spider):
        try:
            event = models.Event.objects.get(url=item["url"])
            item.instance = event

        except models.Event.DoesNotExist:
            pass

        item.save()
        return item
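The behaviour requested above could look roughly like the following. This is only a sketch with plain-Python stand-ins, not the library's actual code: `FakeEvent` and `SketchItem` are hypothetical names, and no Django or Scrapy is involved. It shows an item whose `.instance` can be assigned, with attribute setting deferred to `save()` so that fields missing from the item keep their existing values.

```python
class FakeEvent:
    """Hypothetical stand-in for a Django model instance."""

    def __init__(self, **fields):
        self.id = None
        self.saved = False
        for name, value in fields.items():
            setattr(self, name, value)

    def save(self):
        # A real model would write to the database; this only records the call.
        self.saved = True


class SketchItem(dict):
    """Hypothetical item whose .instance may be assigned before save()."""

    model = FakeEvent

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.instance = None  # optionally set to an existing model instance

    def save(self, commit=True):
        if self.instance is None:
            self.instance = self.model()
        # Attributes are copied here rather than at construction time, so
        # fields absent from the item (such as created) are left untouched.
        for name, value in self.items():
            setattr(self.instance, name, value)
        if commit:
            self.instance.save()
        return self.instance


existing = FakeEvent(url="https://example.com/e/1", title="Old", created="2019-01-01")
existing.id = 7

item = SketchItem(url="https://example.com/e/1", title="New")
item.instance = existing
saved = item.save()
```

Here `saved` is the same object as `existing`: its `title` is replaced by the scraped value, while `id` and `created` are preserved.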


craiga commented Jul 2, 2019

The workaround is to set created in the pipeline:

class EventDjangoPipeline(object):
    def process_item(self, item, spider):
        try:
            event = models.Event.objects.get(url=item["url"])
            item["created"] = event.created
            event_id = event.id
            event = item.save(commit=False)
            event.id = event_id

        except models.Event.DoesNotExist:
            pass

        item.save()
        return item
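A more general variant of this workaround is to fetch the existing row and copy the scraped fields onto it in a loop, rather than carrying individual fields such as created back into the item. The sketch below assumes the item behaves like a mapping; `copy_item_fields` is a hypothetical helper, and `SimpleNamespace` stands in for a model instance so the example runs without Django.

```python
from types import SimpleNamespace


def copy_item_fields(item, instance, skip=("id", "created")):
    """Copy each field of a mapping-like item onto an existing instance.

    Fields named in `skip` are never overwritten, so values the database
    already holds (primary key, creation timestamp) survive the update.
    """
    for name, value in item.items():
        if name in skip:
            continue
        setattr(instance, name, value)
    return instance


# Stand-in for an Event row that was loaded from the database.
event = SimpleNamespace(id=7, url="https://example.com/e/1",
                        title="Old", created="2019-01-01")

# Stand-in for a freshly scraped item.
scraped = {"url": "https://example.com/e/1", "title": "New", "created": "2019-07-02"}

updated = copy_item_fields(scraped, event)
```

In a real pipeline the instance would then be saved with its own `save()` method instead of calling `item.save()`.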
