So, here is the DOUBLE volume V2022B for the interval 01.2021-05.2022 in the series of composite safebooru-based rips
11.2021 - 01.2022 volume V2022A
08.2021 - 11.2021 volume V2021D
06.2021 - 08.2021 volume V2021C
03.2021 - 06.2021 volume V2021B
12.2020 - 03.2021 volume V2021A
09.2020 - 12.2020 volume V2020D
06.2020 - 09.2020 volume V2020C
02.2020 - 05.2020 volume V2020B
08.2019 - 01.2020 volume V2020A
11.2018 - 08.2019 volume V2019
aimed to feed BOORU-CHARS OPEN DATASET 2021 and 2022
These rips are not intended to be “complete and maximum quality” but rather a “representative best of”, to help users
not to lose an interesting fandom, artist or even a single prominent picture, and to get all the stuff in several clicks.
Another reason to build this megalith is neural network training on art images. There are promising results, stay tuned.
Sources used (priorities high to low when deduplicating):
218.022 images sorted and zipped according to aspect ratio (dimensions, 2 folders), priorities high to low:
and also by source and (sometimes) ID range, as mentioned in the folder/archive name.
You can browse pictures directly in the archives with FastStone MaxView or something like it.
File name structure: %website% - %id% - %up_to_3_copyrights% ~ %up_to_5_characters% (%up_to_2_artists%).%ext% where
so you can extract subsets of interest with xcopy (from already unzipped images) or by unzipping (from the release on the fly), e.g.
for %%F in ("d:\Safebooru 2022b\*.zip") do 7z x -r -o"e:\sortarea\" "%%F" *sono*bisque*doll*
xcopy /s "d:\Safebooru 2022b\*sono*bisque*doll*" e:\sortarea\
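If wildcards are not enough, the file name structure described above can also be taken apart with a small script. This is only a rough Python sketch, not part of the release tooling: the function names `parse_name` and `matches` are made up for illustration, the sample file name in the usage note is invented, the %id% is assumed to be numeric, and the artists part is assumed not to contain parentheses. The separators inside the copyrights, characters and artists fields are not assumed, so those fields are returned as raw strings.

```python
import fnmatch
import re

# Mirrors: %website% - %id% - %copyrights% ~ %characters% (%artists%).%ext%
# Assumptions: numeric id, no parentheses inside the artists part.
NAME_RE = re.compile(
    r"^(?P<website>.+?) - (?P<id>\d+) - (?P<copyrights>.*?) ~ "
    r"(?P<characters>.*) \((?P<artists>[^()]*)\)\.(?P<ext>[^.]+)$"
)


def parse_name(file_name):
    """Split a file name into its raw fields; return None if it does not fit."""
    m = NAME_RE.match(file_name)
    return m.groupdict() if m else None


def matches(file_name, wildcard):
    """Case-insensitive wildcard test, similar in spirit to the xcopy/7z masks."""
    return fnmatch.fnmatchcase(file_name.lower(), wildcard.lower())
```

For example, `parse_name("safebooru - 123456 - sono bisque doll wa koi wo suru ~ kitagawa marin (some artist).jpg")` gives a dictionary with the keys `website`, `id`, `copyrights`, `characters`, `artists` and `ext`, and `matches(name, "*sono*bisque*doll*")` selects the same kind of subset as the commands above.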
Transformations and filters:
Some meta-information is included in tab-delimited files:
Using a database, you can play with SQL and xcopy (from already unzipped images, copy-pasting the query result) to pull out anything you want, e.g.
select 'xcopy "d:\'||torr_path||'\'||file_name||'" e:\sortarea ' xc
from files f
join tags t on t.booru=f.booru and t.fid=f.fid
where t.tag like 'ukrain%' -- support our fight for freedom any way you can
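One possible way to run a query like the one above without installing a database server is SQLite from Python. This is a hedged sketch only: the two-table layout (`files`, `tags`) and the column names are inferred from the query itself, while the column order, the absence of a header row in the TSV files, and the helper names `open_db`, `load_tsv` and `xcopy_commands` are assumptions made for illustration. Adjust the schema to the actual TSV files before use.

```python
import csv
import sqlite3

# Assumed schema: only the columns that appear in the query above.
SCHEMA = """
create table files (booru text, fid integer, torr_path text, file_name text);
create table tags  (booru text, fid integer, tag text);
"""


def open_db(path=":memory:"):
    """Create an SQLite database with the assumed tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def load_tsv(conn, table, tsv_path, n_columns):
    """Bulk-load a tab-delimited file; assumes no header row.

    Lines with a different number of fields are skipped.
    """
    placeholders = ",".join("?" * n_columns)
    with open(tsv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = (r for r in reader if len(r) == n_columns)
        conn.executemany(f"insert into {table} values ({placeholders})", rows)
    conn.commit()


def xcopy_commands(conn, tag_prefix, src_root="d:\\", dest="e:\\sortarea"):
    """Return one xcopy command per file that has a tag starting with tag_prefix."""
    sql = """
        select distinct f.torr_path, f.file_name
        from files f
        join tags t on t.booru = f.booru and t.fid = f.fid
        where t.tag like ?
        order by f.torr_path, f.file_name
    """
    rows = conn.execute(sql, (tag_prefix + "%",))
    return [f'xcopy "{src_root}{p}\\{n}" {dest}' for p, n in rows]
```

The `distinct` keeps a file from being listed twice when several of its tags match. The resulting lines can be written to a .bat file or pasted into the console, the same way as the query result.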
NOTE-1: several hundred too-NSFW images were excluded from the ZIPs in a last-minute patch; this is not reflected in the TSV metadata.
Comments - 1
SomaHeir
Glad you’re still doing this :)