Firefox and its forks use the <HR> tag to insert a separator line between bookmarks, and this prevents the parser from working correctly. As far as I can tell, Chrome and its forks do not emit this tag.
A possible solution is to remove every <HR> tag before parsing, e.g. by adding the following function to parser.py:
    # remember to import re
    def __remove_hr_tags(html_lines):
        # Compile the regex pattern for matching <HR> tags (case-insensitive)
        hr_pattern = re.compile(r'<hr[^>]*>', re.IGNORECASE)
        # Process each line
        cleaned_lines = []
        for line in html_lines:
            # Remove <HR> tags from the line
            cleaned_line = hr_pattern.sub('', line)
            cleaned_lines.append(cleaned_line)
        return cleaned_lines
Then, in the parse() function:
    def parse(netscape_bookmarks_file: NetscapeBookmarksFile):
        """
        Responsible to start parsing, getting metadata information and start the folder recursion
        :param netscape_bookmarks_file: a NetscapeBookMarkFile
        :return: the NetscapeBookMarkFile, but parsed
        """
        line_num = 0
        file = netscape_bookmarks_file
        lines = netscape_bookmarks_file.html.splitlines()
        # Remove the <HR> tag
        lines = __remove_hr_tags(lines)
        # rest of the code...
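To illustrate the effect, here is a minimal standalone sketch that applies the same regex to a small Firefox-style snippet. The `remove_hr_tags` name (without the leading underscores), the `HR_PATTERN` constant and the sample HTML are made up for this example and are not part of the library.

```python
import re

# Same pattern as in the proposal above: any <HR ...> tag, case-insensitive.
HR_PATTERN = re.compile(r'<hr[^>]*>', re.IGNORECASE)


def remove_hr_tags(html_lines):
    """Return a copy of html_lines with every <HR> tag stripped out."""
    return [HR_PATTERN.sub('', line) for line in html_lines]


# Hypothetical sample in the style of a Firefox bookmarks export.
sample = """<DL><p>
    <DT><A HREF="https://example.com/">Example</A>
    <HR>
    <DT><A HREF="https://example.org/">Another</A>
</DL><p>"""

cleaned = remove_hr_tags(sample.splitlines())

# The separator line is now blank; the number of lines is unchanged.
for number, line in enumerate(cleaned):
    print(number, repr(line))
```

Because the tags are removed within each line rather than by dropping lines, the line count stays the same. That may matter if the parser relies on line positions (parse() keeps a line_num counter), but I have not checked how the rest of the code handles the resulting blank lines.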