Tech Support
Posted May 12, 2017

KVS provides an API for using the youtube-dl server library to scrape videos from other tube sites. You can implement your own grabber class in PHP and upload it into KVS. Here is how this can be done. The example features a fully working custom YouTube grabber (KVS has a built-in grabber for YouTube, by the way).

NOTE: it is not strictly required to use the youtube-dl API; it is also possible to create a completely custom grabber with your own code.

Implementing grabber class using youtube-dl API

Create CustomGrabberYoutube.php with the following code (also attached here as a text file):

    <?php

    // when you change classname, change it at the very bottom as well in this line:
    // $grabber = new CustomGrabberYoutube();
    class CustomGrabberYoutube extends KvsGrabberVideoYDL
    {
        // ======================================================================
        // infrastructure methods
        // ======================================================================

        public function get_grabber_id()
        {
            // prefix your grabber ID with "custom_"
            return "custom_videos_youtube";
        }

        public function get_grabber_name()
        {
            // name displayed in admin panel
            return "youtube.com";
        }

        public function get_grabber_version()
        {
            // this is required for grabbers that are autoupdated from KVS
            return "1";
        }

        public function get_grabber_domain()
        {
            // domain name, KVS will check this to find out if this grabber is suitable for the given URL
            return "youtube.com";
        }

        public function get_supported_url_patterns()
        {
            // returns list of regexp patterns that describe video URLs, for youtube this pattern will match
            // https://www.youtube.com/watch?v=htOroIbxiFY
            return array("/https?:\/\/(www\.)?youtube\.com\/watch.*/i");
        }

        public function can_grab_description()
        {
            // return true if your grabber is going to provide description for each video
            return false;
        }

        public function can_grab_categories()
        {
            // return true if your grabber is going to provide categories for each video
            return false;
        }

        public function can_grab_tags()
        {
            // return true if your grabber is going to provide tags for each video
            return false;
        }

        public function can_grab_models()
        {
            // return true if your grabber is going to provide models for each video
            return false;
        }

        public function can_grab_content_source()
        {
            // return true if your grabber is going to provide content source for each video
            return false;
        }

        public function can_grab_date()
        {
            // return true if your grabber is going to provide date for each video
            return false;
        }

        public function can_grab_rating()
        {
            // return true if your grabber is going to provide rating for each video
            return false;
        }

        public function can_grab_views()
        {
            // return true if your grabber is going to provide views for each video
            return false;
        }

        public function can_grab_video_files()
        {
            // this should be true for youtube-dl
            return true;
        }

        public function get_supported_qualities()
        {
            // list of supported video qualities, should match what youtube-dl returns in its info under formats
            // run this command:
            // youtube-dl --dump-json https://www.youtube.com/watch?v=PhDXRCLsqz4 >> test.json
            // and open test.json in Firefox, find "formats" array and look into the available formats
            // youtube has too many formats, KVS only supports formats with "ext"="mp4"
            // you can list them here and you will be able to select from them in grabber settings
            return array('360p', '720p');
        }

        public function get_downloadable_video_format()
        {
            // for youtube-dl grabber KVS only supports mp4 formats
            return 'mp4';
        }

        public function can_grab_lists()
        {
            // return true if you want to allow this grabber to grab lists and thus be used on autopilot
            // if true, you will also need to implement grab_list() method - see below
            return false;
        }

        // ======================================================================
        // parsing methods - modify if you need to parse lists or add additional info
        // ======================================================================

        public function grab_list($list_url, $limit)
        {
            // this method is used to grab lists of videos from the given list URL
            // $limit parameter means the number of videos to grab (including pagination)
            // if $limit == 0, then you just need to find all videos on the given URL, no need to care about pagination
            $result = new KvsGrabberListResult();

            // $page_content here is the HTML code of the given page
            $page_content = $this->load_page($list_url);

            // parse $page_content and add all video URLs to the result
            // consider pagination if needed
            // you can use $this->load_page($list_url) method to get HTML from any URL
            $result->add_content_page("https://youtube.com/video1");
            $result->add_content_page("https://youtube.com/video2");
            $result->add_content_page("https://youtube.com/video3");

            return $result;
        }

        protected function grab_video_data_impl($page_url, $tmp_dir)
        {
            // by default the base class will populate these fields (if provided by youtube-dl):
            // - title
            // - MP4 video files for the qualities listed in get_supported_qualities() function
            // - description (should be enabled in can_grab_description() function)
            // - date (should be enabled in can_grab_date() function)
            // - tags (should be enabled in can_grab_tags() function)
            // - categories (should be enabled in can_grab_categories() function)
            $result = parent::grab_video_data_impl($page_url, $tmp_dir);
            if ($result->get_error_code() > 0) {
                return $result;
            }

            // do any custom grabbing here for additional fields, which are not supported by youtube-dl

            // $page_content here is the HTML code of the given video page
            //$page_content = $this->load_page($page_url);

            // parse HTML code and set additional data into $result, e.g. data which is not provided by youtube-dl
            //$result->set_rating(85);
            //$result->set_votes(10);
            //$result->set_views(123874);
            //$result->set_content_source("Content Source Name");
            //$result->add_model("Model 1");
            //$result->add_model("Model 2");

            return $result;
        }
    }

    $grabber = new CustomGrabberYoutube();
    KvsGrabberFactory::register_grabber_class(get_class($grabber));
    return $grabber;

The code has comments where needed. Basically, youtube-dl provides the main video info: title, description, tags, categories, date and files. If this is enough for you, you only need to modify the methods at the top, grouped under the "infrastructure methods" section. These methods integrate the grabber into KVS, so you should change them as described in the comments. You should also change the grabber class name in 2 places (top and bottom) and make sure that the class name is unique and has Custom in it (to avoid clashes with any future grabbers we will add).

If you want to implement list parsing or add additional info, you should modify the parsing methods as explained in the code.

Implementing grabber class without youtube-dl

Here is an example grabber class that does not use youtube-dl. Put your custom parsing logic into it:

    <?php

    // when you change classname, change it at the very bottom as well in this line:
    // $grabber = new CustomGrabberYoutube();
    class CustomGrabberYoutube extends KvsGrabberVideo
    {
        // ======================================================================
        // infrastructure methods
        // ======================================================================

        public function get_grabber_id()
        {
            // prefix your grabber ID with "custom_"
            return "custom_videos_youtube";
        }

        public function get_grabber_name()
        {
            // name displayed in admin panel
            return "youtube.com";
        }

        public function get_grabber_version()
        {
            // this is required for grabbers that are autoupdated from KVS
            return "1";
        }

        public function get_grabber_domain()
        {
            // domain name, KVS will check this to find out if this grabber is suitable for the given URL
            return "youtube.com";
        }

        public function get_supported_url_patterns()
        {
            // returns list of regexp patterns that describe video URLs, for youtube this pattern will match
            // https://www.youtube.com/watch?v=htOroIbxiFY
            return array("/https?:\/\/(www\.)?youtube\.com\/watch.*/i");
        }

        public function can_grab_description()
        {
            // return true if your grabber is going to provide description for each video
            return true;
        }

        public function can_grab_categories()
        {
            // return true if your grabber is going to provide categories for each video
            return true;
        }

        public function can_grab_tags()
        {
            // return true if your grabber is going to provide tags for each video
            return true;
        }

        public function can_grab_models()
        {
            // return true if your grabber is going to provide models for each video
            return true;
        }

        public function can_grab_content_source()
        {
            // return true if your grabber is going to provide content source for each video
            return true;
        }

        public function can_grab_date()
        {
            // return true if your grabber is going to provide date for each video
            return true;
        }

        public function can_grab_rating()
        {
            // return true if your grabber is going to provide rating for each video
            return true;
        }

        public function can_grab_views()
        {
            // return true if your grabber is going to provide views for each video
            return true;
        }

        public function can_grab_video_files()
        {
            // return true if your grabber is going to provide video files for each video
            return true;
        }

        public function can_grab_video_embed()
        {
            // return true if your grabber is going to provide embed code for each video
            return true;
        }

        public function can_grab_video_duration()
        {
            // return true if your grabber is going to provide duration for each video
            return true;
        }

        public function can_grab_video_screenshot()
        {
            // return true if your grabber is going to provide screenshot for each video
            return true;
        }

        public function get_supported_qualities()
        {
            // list of supported video qualities that your grabber provides
            return array('360p', '720p');
        }

        public function get_downloadable_video_format()
        {
            // only grabbers that return MP4 files are supported
            return 'mp4';
        }

        public function can_grab_lists()
        {
            // return true if you want to allow this grabber to grab lists and thus be used on autopilot
            // if true, you will also need to implement grab_list() method - see below
            return false;
        }

        // ======================================================================
        // parsing methods
        // ======================================================================

        public function grab_list($list_url, $limit)
        {
            // this method is used to grab lists of videos from the given list URL
            // $limit parameter means the number of videos to grab (including pagination)
            // if $limit == 0, then you just need to find all videos on the given URL, no need to care about pagination
            $result = new KvsGrabberListResult();

            // $page_content here is the HTML code of the given page
            $page_content = $this->load_page($list_url);

            // parse $page_content and add all video URLs to the result
            // consider pagination if needed
            // you can use $this->load_page($list_url) method to get HTML from any URL
            $result->add_content_page("https://youtube.com/video1");
            $result->add_content_page("https://youtube.com/video2");
            $result->add_content_page("https://youtube.com/video3");

            return $result;
        }

        protected function grab_video_data_impl($page_url, $tmp_dir)
        {
            $result = new KvsGrabberVideoInfo();

            // $page_code here is the HTML code of the given video page
            $page_code = $this->load_page($page_url);
            if (!$page_code) {
                $result->log_error(KvsGrabberVideoInfo::ERROR_CODE_PAGE_UNAVAILABLE, "Page can't be loaded: $page_url");
                return $result;
            }

            // parse HTML code and set data into $result
            // replace with your parsing logic
            $result->set_canonical($page_url);
            $result->set_title("Demo title");
            $result->set_description("Demo description long description long description long description long description.");
            $result->set_screenshot("http://www.localhost.com/test/test.jpg");
            $result->set_duration(30);
            $result->set_date(time());
            $result->set_views(1526);
            $result->set_rating(87);
            $result->set_votes(11);
            $result->set_embed("<div>embed code</div>");

            $result->add_category("Category 1");
            $result->add_category("Category 2");
            $result->add_category("Category 3");

            $result->add_tag("Tag 1");
            $result->add_tag("Tag 2");
            $result->add_tag("Tag 3");

            $result->add_model("Model 1");
            $result->add_model("Model 2");
            $result->add_model("Model 3");

            $result->set_content_source("Content Source 1");

            $result->add_video_file("360p", "http://www.localhost.com/test/test_360p.mp4");
            $result->add_video_file("720p", "http://www.localhost.com/test/test_720p.mp4");

            $result->add_custom_field(1, "Custom1");
            $result->add_custom_field(3, "Custom3");

            return $result;
        }
    }

    $grabber = new CustomGrabberYoutube();
    KvsGrabberFactory::register_grabber_class(get_class($grabber));
    return $grabber;

Testing grabber class

Put the grabber class file into your project root folder. Also create a test_grabber.php file in the same folder with the following code:

    <?php
    header('Content-Type: text/plain; charset=utf-8');
    ini_set('display_errors', 1);
    error_reporting(E_ERROR | E_PARSE | E_COMPILE_ERROR);

    require_once('admin/plugins/grabbers/classes/KvsGrabber.php');

    $grabber = require_once('CustomGrabberYoutube.php');
    $grabber->init(new KvsGrabberSettings(), "");
    if ($grabber instanceof KvsGrabberVideoYDL) {
        $grabber->set_ydl_binary('/usr/local/bin/youtube-dl');
    }
    print_r($grabber->grab_video_data('https://www.youtube.com/watch?v=htOroIbxiFY', 'tmp'));

Change this code to use your class file name and your demo URL. Then run it in a browser:

http://domain.com/test_grabber.php

If everything is fine, you should see the dumped info of the scraped video.

Installing grabber into KVS

Go to Plugins -> Grabbers in the admin panel and upload your grabber class into the Custom grabber field. After saving the form you will see your grabber installed and marked in red. Open the grabber's settings and select Content mode = Download. Also enable the needed fields under Data.

NOTE: If you don't see any fields under Data, then your grabber class doesn't return true from any of the can_grab_xxx() methods.

If you want to update the grabber class, simply upload it again. It is recommended to increment the version in the get_grabber_version() method, so you can be sure which version KVS is using.

Finding the list of supported video files to grab

If you don't know which formats the source site provides (usually a subset of: 240p, 360p, 480p, 720p, 1080p), you can check that with youtube-dl:

    youtube-dl --dump-json https://www.youtube.com/watch?v=PhDXRCLsqz4 >> test.json

This should generate a test.json file, which can be opened in Firefox to show the JSON structure. Find a node called formats; it should be a list with items describing each supported format. KVS can only import formats with ext = mp4. You can list them in the get_supported_qualities() method using XXXp notation, e.g. 360p, 720p.

Here is a sample screenshot for youtube: [screenshot not preserved]

Attachment: CustomGrabberYoutube.txt
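As an alternative to reading test.json by eye, the format check described under "Finding the list of supported video files to grab" can be roughly scripted. This is only a sketch: list_mp4_qualities() is a hypothetical helper, not part of KVS, and it assumes that the "height" field of each entry in "formats" is what corresponds to the XXXp notation. The sample JSON is hand-written in the same shape, not real youtube-dl output.

```php
<?php
// Hypothetical helper (not part of KVS): returns the distinct "XXXp" labels
// of all formats with ext = mp4 found in youtube-dl --dump-json output.
function list_mp4_qualities(string $json): array
{
    $info = json_decode($json, true);
    if (!is_array($info) || !isset($info['formats']) || !is_array($info['formats'])) {
        return [];
    }

    $heights = [];
    foreach ($info['formats'] as $format) {
        $ext = $format['ext'] ?? '';
        $height = $format['height'] ?? null;
        // Assumption: "height" maps to the XXXp quality label.
        if ($ext === 'mp4' && is_int($height) && $height > 0) {
            $heights[$height] = true; // array keys de-duplicate equal heights
        }
    }

    ksort($heights);
    return array_map(function ($h) {
        return $h . 'p';
    }, array_keys($heights));
}

// Hand-written sample, not real youtube-dl output.
$sample = '{"formats":['
    . '{"ext":"webm","height":360},'
    . '{"ext":"mp4","height":720},'
    . '{"ext":"mp4","height":360},'
    . '{"ext":"m4a","height":null}'
    . ']}';

print_r(list_mp4_qualities($sample));
```

To run it against a real dump, pass file_get_contents('test.json') instead of $sample. Some mp4 entries may be video-only streams, so treat the result as a list of candidates for get_supported_qualities(), not as a guarantee that each one can be imported.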
xvids
Posted May 12, 2017

Hi, thanks. But how do I install youtube-dl on Debian? The youtube-dl library is not found (https://github.com/rg3/youtube-dl). Is it done with the following commands, or something else?

    // path
    sudo curl -L https://yt-dl.org/downloads/latest/youtube-dl -o /usr/local/bin/youtube-dl
    sudo chmod a+rx /usr/local/bin/youtube-dl
Tech Support
Posted May 12, 2017

Quoting xvids: "Hi, Thanks, But how to install youtube-dl on debian"

youtube-dl is a public library with its own installation instructions. There is no need to duplicate them here, since they may change with new releases.
Rick
Posted October 10, 2017

Will you be able to post an example for the album grabber as well? Thanks! I think the shipped code for it is encrypted like everything else, last I checked.
Tech Support
Posted October 11, 2017

Here is sample code for an album grabber:

    <?php

    class KvsGrabberAlbumCustomSample extends KvsGrabberAlbum
    {
        public function get_grabber_id()
        {
            return "albums_custom_sample";
        }

        public function get_grabber_name()
        {
            return "Sample custom grabber";
        }

        public function get_grabber_version()
        {
            return "1";
        }

        public function get_grabber_domain()
        {
            return "domain1.com";
        }

        public function get_supported_url_patterns()
        {
            return array("/https?:\/\/(www\.)?domain1\.com\/.*/i");
        }

        public function can_grab_description()
        {
            return true;
        }

        public function can_grab_categories()
        {
            return true;
        }

        public function can_grab_tags()
        {
            return true;
        }

        public function can_grab_models()
        {
            return true;
        }

        public function can_grab_content_source()
        {
            return true;
        }

        public function can_grab_rating()
        {
            return true;
        }

        public function can_grab_views()
        {
            return true;
        }

        public function can_grab_date()
        {
            return true;
        }

        public function can_grab_lists()
        {
            return true;
        }

        public function grab_list($list_url, $limit)
        {
            $result = new KvsGrabberListResult();
            $result->add_content_page("http://domain1.com/album1/");
            $result->add_content_page("http://domain1.com/album2/");
            return $result;
        }

        protected function grab_album_data_impl($page_url, $tmp_dir)
        {
            $result = new KvsGrabberAlbumInfo();

            $page_code = $this->load_page($page_url);
            if (!$page_code) {
                $result->log_error(KvsGrabberAlbumInfo::ERROR_CODE_PAGE_UNAVAILABLE, "Page can't be loaded: $page_url");
                return $result;
            }

            $result->set_canonical($page_url);
            $result->set_title("Demo title");
            $result->set_description("Demo description long description long description long description long description.");
            $result->set_date(time());
            $result->set_views(1526);
            $result->set_rating(87); // 0-100%
            $result->set_votes(11);

            $result->add_category("Category 1");
            $result->add_category("Category 2");
            $result->add_category("Category 3");

            $result->add_tag("tag 1");
            $result->add_tag("tag 2");
            $result->add_tag("tag 3");

            $result->add_model("Model 1");
            $result->add_model("Model 2");
            $result->add_model("Model 3");

            $result->set_content_source("Content Source 1");

            $result->add_image_file("http://www.domain1.com/test/test.jpg?v=1");
            $result->add_image_file("http://www.domain1.com/test/test.jpg?v=2");

            return $result;
        }
    }

    $grabber = new KvsGrabberAlbumCustomSample();
    KvsGrabberFactory::register_grabber_class(get_class($grabber));
    return $grabber;
vqporn
Posted March 31, 2018

Fatal error: Call to a member function is_import_categories_as_tags() on a non-object in /home/admin/web/xxxxxxxx/public_html/admin/plugins/grabbers/classes/KvsGrabber.php on line 2842
vqporn
Posted March 31, 2018

Hello, how do I make the grabber detect all the videos and all the albums on a given URL?
Tech Support
Posted April 2, 2018

Quoting vqporn: "Fatal error: Call to a member function is_import_categories_as_tags() on a non-object in /home/admin/web/xxxxxxxx/public_html/admin/plugins/grabbers/classes/KvsGrabber.php on line 2842"

We updated the test code in the original post to fix this issue. The new grabber API has things coded differently.

Quoting vqporn: "Hello as I do so that by url I detect all the vidos in a url and all the albums in a url"

Content URLs on the page should be detected automatically, based on the patterns you return from this function:

    public function get_supported_url_patterns()
    {
        return array("/regexp here/i");
    }
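As a rough illustration of what pattern-based detection means, here is a minimal sketch. It is not KVS's actual implementation: find_content_urls(), the album pattern and the sample HTML are all hypothetical, and the only idea taken from the answer above is that links on a page are kept when they match one of the regexps returned by get_supported_url_patterns().

```php
<?php
// Illustrative only, not KVS code: collect the links in $html that match
// at least one of the given URL patterns.
function find_content_urls(string $html, array $patterns): array
{
    // Pull every href value out of the page.
    preg_match_all('/href=["\']([^"\']+)["\']/i', $html, $matches);

    $found = [];
    foreach ($matches[1] as $url) {
        foreach ($patterns as $pattern) {
            if (preg_match($pattern, $url) === 1) {
                $found[$url] = true; // array keys de-duplicate repeated links
                break;
            }
        }
    }
    return array_keys($found);
}

// Hypothetical pattern: only /albumN/ pages count as content URLs.
$patterns = ['/^https?:\/\/(www\.)?domain1\.com\/album\d+\/$/i'];

$html = '<a href="http://domain1.com/album1/">A</a>'
    . '<a href="http://domain1.com/about/">About</a>'
    . '<a href="http://domain1.com/album2/">B</a>'
    . '<a href="http://domain1.com/album1/">A again</a>';

print_r(find_content_urls($html, $patterns));
```

The practical consequence is that a pattern which is too broad (such as one ending in ".*" right after the domain) will also match navigation and category links, so it is usually worth making the pattern as specific as the site's URL scheme allows.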
Sterx
Posted February 1, 2022

Hello, I added my class 'CustomGrabberRedporn' into KVS (KVS v5.5.0) and activated it. I also uploaded test_grabber.php to my server and ran it. I get the following error:

Fatal error: require_once(): Failed opening required 'CustomGrabberRedporn.php' (include_path='.:/usr/share/php') in .......cc/test_grabber.php on line 9

Can you help me?
Tech Support
Posted February 2, 2022

For testing purposes you don't need to upload the grabber into the KVS plugin. You need to put your CustomGrabberRedporn.php file next to the test_grabber.php file, in the same directory.
AsianViralHub
Posted yesterday at 03:16 AM

Is it possible to download different video links from the same page, with titles like title1, title2, title3?
Tech Support
Posted yesterday at 02:47 PM

Grabbers in KVS have 2 ways of parsing:

1) Grab an individual video URL (using the grab_video_data_impl($page_url, $tmp_dir) function in the grabber PHP code).
2) Grab a list URL, which produces a list of URLs that are then passed into the function from #1 (using the grab_list($list_url, $limit) function in the grabber PHP code).

When you submit a URL to a grabber, KVS checks whether it is an individual URL or not. If the URL is considered individual, it is passed to the grab_video_data_impl() function to grab video details from it. Otherwise the URL is considered a list URL and is passed to the grab_list() function to get a list of individual URLs from it. The detection is based on whether the provided URL matches one of the regexps returned from the get_supported_url_patterns() function.

So you should code accordingly. For example, you can define the individual video URL pattern so that it matches only when the URL has some #hash at the end. Say the video page is:

https://www.kvs-demo.com/videos/69/300-spartans/

And the detected videos on this page are these:

https://www.kvs-demo.com/videos/69/300-spartans/#video1
https://www.kvs-demo.com/videos/69/300-spartans/#video2

Then you should code the get_supported_url_patterns() function so that it returns a pattern with #videoN at the end. In this case the page URL without a hash does not match the pattern, so it will be passed to the grab_list() function. This function should parse the page and return 2 sub-URLs with #video1 and #video2 at the end. Finally these sub-URLs will be passed to the grab_video_data_impl() function, which does the actual parsing and returns video details. Based on the #hash passed in the URL, it can work out whether it should return title 1 or title 2.
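The hash-based approach above can be sketched in plain PHP, outside of any grabber class. Everything here is an assumption made for illustration: the pattern, classify_url(), build_sub_urls() and video_index_from_url() are hypothetical names, and the dispatch only mimics the rule described above (a URL that matches the pattern is treated as an individual video, anything else as a list).

```php
<?php
// Hypothetical pattern: only URLs ending in #videoN count as individual videos.
// This is the kind of regexp get_supported_url_patterns() could return.
const VIDEO_URL_PATTERN = '/^https?:\/\/(www\.)?kvs-demo\.com\/videos\/\d+\/[^\/#]+\/#video(\d+)$/i';

// Mimics the described dispatch: pattern match => individual URL, else list URL.
function classify_url(string $url): string
{
    return preg_match(VIDEO_URL_PATTERN, $url) === 1 ? 'video' : 'list';
}

// What a grab_list() implementation might produce: one sub-URL per video on the page.
function build_sub_urls(string $page_url, int $video_count): array
{
    $sub_urls = [];
    for ($i = 1; $i <= $video_count; $i++) {
        $sub_urls[] = $page_url . '#video' . $i;
    }
    return $sub_urls;
}

// What grab_video_data_impl() could do to decide which title to return:
// read the video index back from the hash.
function video_index_from_url(string $url): ?int
{
    if (preg_match('/#video(\d+)$/', $url, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

$page = 'https://www.kvs-demo.com/videos/69/300-spartans/';
echo $page, ' => ', classify_url($page), "\n";
foreach (build_sub_urls($page, 2) as $sub_url) {
    echo $sub_url, ' => ', classify_url($sub_url), ' #', video_index_from_url($sub_url), "\n";
}
```

In a real grabber the number of videos would come from parsing the page HTML inside grab_list(), and the index recovered from the hash would select which video block on the page to read the title and files from.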