
stealing images from IMDB

Posted: November 8th, 2008, 23:12
by ProfHawking
I'm trying to write a movie archive scripty thing in PHP.
I have got it pulling lots of details from IMDb when given a film title, but getting the poster images is proving fiddly.

This works, kinda:

Code:

$this->poster		= trim($this->getMatch('|<div class="photo">(.*)</div>|Uis', $imdb_content));
For example, it returns:

Code:

<a name="poster" href="/rg/action-box-title/primary-photo/media/rm2762448640/tt0092067" title="Tenkû no shiro Rapyuta"><img alt="Tenkû no shiro Rapyuta" title="Tenkû no shiro Rapyuta" src="http://ia.media-imdb.com/images/M/MV5BMTU4MTUyMTc3MV5BMl5BanBnXkFtZTYwOTg4Mzk5._V1._SX98_SY140_.jpg" border="0"></a>
on sky blue.

All I want it to return is the image source:

Code:

http://ia.media-imdb.com/images/M/MV5BMTU4MTUyMTc3MV5BMl5BanBnXkFtZTYwOTg4Mzk5._V1._SX98_SY140_.jpg

Does anyone know how I can get it to do this?

Posted: November 8th, 2008, 23:28
by buzzmong
I'd just go with some string manipulation on the outputted string, narf. It shouldn't be too much effort to get PHP to find the src=" in the string and return what follows it, either overwriting the var or assigning it to a new one. The explode function might come in useful.
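
A rough sketch of the explode idea, in case it helps. It assumes the markup looks like the sample in the first post, with exactly one src="..." attribute in double quotes. The helper name extractSrc() and the variable $poster_html are made up for illustration and are not part of the original script.

```php
<?php
// Sample input: the markup the original getMatch() call returned.
$poster_html = '<a name="poster" href="/rg/action-box-title/primary-photo/media/rm2762448640/tt0092067" title="Tenkû no shiro Rapyuta"><img alt="Tenkû no shiro Rapyuta" title="Tenkû no shiro Rapyuta" src="http://ia.media-imdb.com/images/M/MV5BMTU4MTUyMTc3MV5BMl5BanBnXkFtZTYwOTg4Mzk5._V1._SX98_SY140_.jpg" border="0"></a>';

// Hypothetical helper: returns the value of the first src="..." attribute,
// or null if the string contains no src attribute.
function extractSrc($html)
{
    // Split once at src=" so the second piece starts with the URL.
    $parts = explode('src="', $html, 2);
    if (count($parts) < 2) {
        return null;
    }
    // Split once more at the closing quote; the first piece is the URL.
    $pieces = explode('"', $parts[1], 2);
    return $pieces[0];
}

$poster = extractSrc($poster_html);
echo $poster, "\n";
```

This breaks if the page ever switches to single quotes or puts another src attribute first, so it is only as robust as the markup is stable.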

I've not done php since last year, and even then I was mostly doing DB calls and some basic formatting.

Posted: November 8th, 2008, 23:30
by Stoat
You can just regex the source out of it, can't you? It starts with 'http://ia.media-imdb.com/images/' and ends with '.jpg', so extracting the URL should be easy enough with preg_match.

Posted: November 9th, 2008, 0:13
by Fear
What Stoat said.

Code:

http://ia\.media-imdb\.com/images/.+?\.(?:jpg|gif|jpeg|png)
Something like that perhaps.
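
For what it's worth, here is a minimal sketch of that pattern dropped into preg_match. The pattern needs delimiters, and since it contains slashes this sketch uses '#' rather than '/'. The variable names are illustrative only, and the sample string is a shortened version of the markup from the first post.

```php
<?php
// Sample input: the <img> tag from the first post.
$poster_html = '<img alt="Tenkû no shiro Rapyuta" title="Tenkû no shiro Rapyuta" src="http://ia.media-imdb.com/images/M/MV5BMTU4MTUyMTc3MV5BMl5BanBnXkFtZTYwOTg4Mzk5._V1._SX98_SY140_.jpg" border="0">';

// The pattern from the post above, wrapped in '#' delimiters because
// it contains '/' characters.
$pattern = '#http://ia\.media-imdb\.com/images/.+?\.(?:jpg|gif|jpeg|png)#';

// preg_match() returns 1 on a match and fills $matches[0] with the
// full matched text, which here is the image URL.
$poster = null;
if (preg_match($pattern, $poster_html, $matches)) {
    $poster = $matches[0];
}

echo $poster, "\n";
```

The lazy .+? stops at the first dot that is followed by one of the listed extensions, so the earlier dots in the file name (before _V1 and _SX98) do not cut the match short.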