Fix archive plugins for НЭБ and Alib; add network integration tests
- html_scraper: add img_alt strategy (НЭБ titles from <img alt>), bold_text strategy (Alib entries from <p><b>), Windows-1251 encoding support, and a _cls_inner_texts() helper that strips inner HTML tags
- rsl: rewrite to POST SearchFilterForm[search] with a CSRF token, using the CQL query format title:(words) AND author:(word)
- config: update rusneb (img_alt + correct author_class) and alib_web (encoding + bold_text) to match the fixed plugin strategies
- tests: add tests/test_archives.py with network-marked tests for all six archive plugins; НЛР and ШПИЛ are marked xfail (endpoints return HTTP 404)
- presubmit: exclude network tests from the default run (-m "not network")

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
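The rsl change above names a query format and a form field but not the code behind them. A minimal sketch of how such a request body might be assembled follows. It assumes the single author word is the first token of the author string and that the CSRF token travels in a form field named `_csrf`; both are guesses, and `build_cql_query` and `build_search_form` are hypothetical names, not functions from this repository.

```python
def build_cql_query(title, author=None):
    """Build a query of the form title:(words) AND author:(word)."""
    # Collapse runs of whitespace so the title words are single-space separated.
    title_words = " ".join(title.split())
    query = f"title:({title_words})"
    author_tokens = author.split() if author else []
    if author_tokens:
        # Assumption: only one author word is sent; this sketch picks the first token.
        query += f" AND author:({author_tokens[0]})"
    return query


def build_search_form(title, author, csrf_token, csrf_field="_csrf"):
    """Return the POST form fields for a search request.

    The csrf_field default is an assumption; the real field name must be read
    from the search page that issued the token.
    """
    return {
        csrf_field: csrf_token,
        "SearchFilterForm[search]": build_cql_query(title, author),
    }
```

The returned dictionary can be passed as form data to whatever HTTP client the plugin uses, after the token has been scraped from the search page in the same session.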
@@ -64,8 +64,8 @@ functions:
 config:
 url: "https://rusneb.ru/search/"
 search_param: q
-title_class: "title"
-author_class: "author"
+img_alt: true
+author_class: "search-list__item_subtext"
 
 alib_web:
 name: "Alib (web)"
@@ -77,8 +77,8 @@ functions:
 url: "https://www.alib.ru/find3.php4"
 search_param: tfind
 extra_params: {f: "5", s: "0"}
-link_href_pattern: "t[a-z]+\\.phtml"
-author_class: "aut"
+encoding: "cp1251"
+bold_text: true
 
 nlr:
 name: "НЛР"
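The new config keys (img_alt, bold_text, encoding) select how titles are pulled out of a results page. The sketch below shows one way such strategies could work using only the standard library. It is not the html_scraper implementation from this commit: `decode_page`, `_TitleCollector`, and `extract_titles` are hypothetical names, and the sample markup in the usage note is invented for illustration.

```python
from html.parser import HTMLParser


def decode_page(raw, encoding="utf-8"):
    """Decode a response body; a cp1251 site would pass encoding="cp1251"."""
    return raw.decode(encoding, errors="replace")


class _TitleCollector(HTMLParser):
    """Collect <img alt> values and the text of <b> elements inside <p>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.img_alts = []
        self.bold_texts = []
        self._in_p = False
        self._bold_depth = 0
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt and alt.strip():
                self.img_alts.append(alt.strip())
        elif tag == "p":
            self._in_p = True
        elif tag == "b" and self._in_p:
            self._bold_depth += 1

    def handle_endtag(self, tag):
        if tag == "p":
            # Leaving the paragraph discards any unfinished bold run.
            self._in_p = False
            self._bold_depth = 0
            self._buf = []
        elif tag == "b" and self._bold_depth:
            self._bold_depth -= 1
            if self._bold_depth == 0:
                # Inner tags such as <i> are dropped; only their text is kept.
                text = " ".join("".join(self._buf).split())
                if text:
                    self.bold_texts.append(text)
                self._buf = []

    def handle_data(self, data):
        if self._bold_depth:
            self._buf.append(data)


def extract_titles(raw, encoding="utf-8", img_alt=False, bold_text=False):
    """Return titles found by whichever strategies are switched on."""
    parser = _TitleCollector()
    parser.feed(decode_page(raw, encoding))
    parser.close()
    titles = []
    if img_alt:
        titles.extend(parser.img_alts)
    if bold_text:
        titles.extend(parser.bold_texts)
    return titles
```

With this sketch, `extract_titles(body, img_alt=True)` mirrors the rusneb entry and `extract_titles(body, encoding="cp1251", bold_text=True)` mirrors the alib_web entry, where `body` is the raw response bytes.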