Lance Levsen wrote:
> The program pulls the whole feed (kind of has to) but just uses the most
> recent post. I compare the timestamp (unix epoch) to the last posts
> timestamp in the postgres db for that blog.
Not really. Send an HTTP HEAD request first and compare the
Last-Modified header against the time you last pulled the feed. HTTP
already has what you need here: a HEAD (or a GET with
If-Modified-Since) transfers only headers, so you download just the
feeds that have actually changed instead of all of them on every check.
Pulling the full RSS every time you check is pretty lame in terms of
resource usage, and it won't scale :p
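
A rough sketch of that check in Python, standard library only. The
function names are made up for illustration, and it assumes the server
actually sends a Last-Modified header; when the header is missing or
unparseable it falls back to "go fetch the feed".

```python
import urllib.request
from datetime import timezone
from email.utils import parsedate_to_datetime


def head_last_modified(url, timeout=10):
    """Issue a HEAD request and return the raw Last-Modified header (or None)."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.headers.get("Last-Modified")


def feed_is_newer(last_modified_header, last_pulled_epoch):
    """Return True if the feed should be downloaded.

    last_pulled_epoch is the unix timestamp stored for the last pull.
    A missing or malformed header means we cannot tell, so we fetch.
    """
    if not last_modified_header:
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        return True
    if modified.tzinfo is None:
        # HTTP dates are always GMT; treat a naive result as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified.timestamp() > last_pulled_epoch
```

Usage would be something like
feed_is_newer(head_last_modified(feed_url), last_epoch_from_db), with
the full GET only happening when that returns True.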
> My question is: if I thread _just_ the I/O collecting the feeds, will
> that burn my database connections when I insert a new post, or will the
> threads deal with it nicely? Or is this even a relevant question w/out code?
PostgreSQL does row-level locking rather than table-level locking, and
readers work from a consistent snapshot (MVCC). This means that if
you've got an app inserting some new rows, those rows stay invisible to
another thread reading the same table until the insert commits, rather
than the entire table being blocked for the time of the transaction
(plus the locking overhead). You could have 100 threads inserting 100
rows, and the other 900 rows would still be easily accessible to your
select statement.
So, no, there's no big deal there.
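
If you want to thread _just_ the I/O, as in the question, one possible
layout is to let worker threads do the fetching and keep every insert
in the main thread on a single connection. This is only a sketch:
collect_and_store, fetch_latest and store_post are hypothetical names,
and store_post stands in for whatever does the actual INSERT.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_and_store(blogs, fetch_latest, store_post, max_workers=8):
    """Fetch feeds in worker threads; store new posts from the calling thread.

    blogs        -- dict of blog_id -> (feed_url, last_post_epoch)
    fetch_latest -- callable(url) -> (post_epoch, post); runs in a worker
    store_post   -- callable(blog_id, post_epoch, post); runs in this thread
    Returns the list of blog ids for which a new post was stored.
    """
    stored = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(fetch_latest, url): (blog_id, last_epoch)
            for blog_id, (url, last_epoch) in blogs.items()
        }
        for fut in as_completed(futures):
            blog_id, last_epoch = futures[fut]
            try:
                post_epoch, post = fut.result()
            except Exception:
                # A failed fetch should not stop the other feeds.
                continue
            # Same comparison as in the original post: epoch vs. stored epoch.
            if post_epoch > last_epoch:
                store_post(blog_id, post_epoch, post)
                stored.append(blog_id)
    return stored
```

Only the network fetches overlap here, so the threads never touch the
database connection at all.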
Received on Fri May 26 20:08:54 2006