Commit Graph

35 Commits

Author SHA1 Message Date
iequidoo
e0d123f732 chore(cargo): bump quick-xml from 0.37.5 to 0.38.3 2025-10-09 14:03:56 -03:00
link2xt
5c3de759d3 refactor: upgrade to Rust 2024 2025-06-28 17:07:59 +00:00
link2xt
e5b79bf405 refactor: replace once_cell::sync::Lazy with std::sync::LazyLock 2025-04-04 20:51:37 +00:00
dependabot[bot]
5beb4a5f27 chore(cargo): bump quick-xml from 0.31.0 to 0.35.0
Bumps [quick-xml](https://github.com/tafia/quick-xml) from 0.31.0 to 0.35.0.
- [Release notes](https://github.com/tafia/quick-xml/releases)
- [Changelog](https://github.com/tafia/quick-xml/blob/master/Changelog.md)
- [Commits](https://github.com/tafia/quick-xml/compare/v0.31.0...v0.35.0)

---
updated-dependencies:
- dependency-name: quick-xml
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Co-authored-by: iequidoo <dgreshilov@gmail.com>
2024-07-02 18:52:29 -03:00
bjoern
810be4f6c7 fix: preserve upper-/lowercase of links parsed by dehtml() (#5362)
this PR fixes a bug that lowercases all links handleld by `dehtml()`,
which is wrong.

closes #5361
2024-03-19 16:38:23 +01:00
iequidoo
4f25072352 fix: dehtml: Don't just truncate text when trying to decode (#5223)
If `escaper::decode_html_buf_sloppy()` just truncates the text (which happens when it fails to
html-decode it at some position), then it's probably not HTML at all and should be left as
is. That's what happens with hyperlinks f.e. and there was even a test on this wrong behaviour which
is fixed now. So, now hyperlinks are not truncated in messages and should work as expected.
2024-02-02 14:55:52 -03:00
link2xt
2d30afd212 fix: do not run simplify() on dehtml() output
simplify() is written to process incoming plaintext messages
and extract footers and quotes from them.
Incoming messages contain various quote styles
and simplify() implements heuristics to detects them.

If dehtml() output is processed by simplify(),
simplify() heuristics may erroneously detect
footers and quotes in produced plaintext.

dehtml() should directly detect quotes
instead of converting them to plaintext quotes
for parsing with simplify().
2023-07-02 23:12:13 +00:00
link2xt
5fe94e8bce docs(dehtml): document AddText variants 2023-07-02 23:12:13 +00:00
link2xt
00cb72f04d fix(dehtml): do not insert unnecessary newlines when parsing <p> tags
Previously, parsing of `<p>Foo</p><p>Bar</p>`
resulted in `\n\nFoo\n\n\n\nBar\n\n`.

Now it results in `Foo\n\nBar`.
2023-06-16 16:27:14 +00:00
link2xt
c3f352aff1 fix(dehtml): skip links with empty text 2023-06-14 15:41:38 +00:00
link2xt
fcf73165ed Inline format arguments
This feature has been stable since Rust 1.58.0.
2023-01-30 11:50:11 +03:00
dependabot[bot]
5432e108a1 cargo: bump quick-xml from 0.23.0 to 0.26.0
Bumps [quick-xml](https://github.com/tafia/quick-xml) from 0.23.0 to 0.26.0.
- [Release notes](https://github.com/tafia/quick-xml/releases)
- [Changelog](https://github.com/tafia/quick-xml/blob/master/Changelog.md)
- [Commits](https://github.com/tafia/quick-xml/compare/v0.23.0...v0.26.0)

---
updated-dependencies:
- dependency-name: quick-xml
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-27 14:01:33 -03:00
iequidoo
c3a0bb2b77 Fix cargo clippy and doc errors after Rust update to 1.66 2022-12-16 02:46:04 +04:00
Friedel Ziegelmayer
290ee20e63 feat: migrate from async-std to tokio 2022-06-27 14:05:21 +02:00
link2xt
d947479a60 Add SimplifiedText structure
Return structure from `simplify` instead of 5-tuple.
2022-06-12 21:08:32 +00:00
link2xt
24d967d6f4 dehtml: update for quick-xml 0.23 2022-06-01 22:03:43 +00:00
link2xt
30cb0cbcfd Reduce number of AsRef generics
They result in compilation of duplicate code.
2021-12-31 13:57:45 +00:00
link2xt
2b7bf11b05 Rust documentation improvements
Document all public modules and some methods.

Make some internal public symbols private.
2021-08-22 15:34:14 +02:00
B. Petersen
9ecb6d9b15 test dehtml for pre-tag (wrote that little test to test the new coverage script :)[D 2021-04-15 01:49:12 +03:00
link2xt
ac9394cb16 dehtml.rs: test </i> tag 2021-04-15 00:30:50 +03:00
Hocuri
179a2a50e6 Parse <blockquote> tags for better quote detection (#2313) 2021-04-07 18:45:00 +02:00
link2xt
0601b05cb7 Use footer as a contact status 2021-02-11 13:57:49 +03:00
bjoern
e2688f6355 add option to access original message (#2125)
* draft API to deal with uncut message texts

* add column mime_modified

* add mime_modified flag to MimeParser and save it in the database

* save mime_headers also when mime_modified is set

* cargo fmt

* set mime_modified on parsed html-texts and when there are multiple alternative-parts; add test for that

* prototype functions, add to repl and ffi

* use correct mime_modified flag

* basically parse Mime-Structure to HTML

* add basic tests for HTML-parsing

* convert text/plain to html for getting original

* respect charset for plain texts

* make test more specific

* fix handling non-utf-8 charsets for plain messages

* add test for plain_to_html()

* add failing test for plaintext linkify

* linkify urls in plain text

* fix regex

* plain text linkify: add failing test for encapsulated links as <https://domain.com>

* plain text linkify: make encapsulated links as <https://domain.com> work

* plain text linkify: require word boundary at beginning of link, add tests for that

* plain text linkify: linkify emails

* plain text: support format=flowed

* plain text: support quotes

* make clippy happy

* set mime-modified also when simplify() cuts non-html messages, add tests for that

* streamline mime recursion

* repl tool: write original html to file for further processing

* convert cid:- to data:-protocol

* add a test for cid: to data: conversion

* make clippy happy

* fix html-tests to work with windows-lineends

* clarify what the returned html-code may contain

* add some more detailed doc comments

* add mime_modified column only if not exist

this additional check is needed
as the column may added with another dbversion in
some shipped beta-versions.

* incorporate documentation suggestions from review

* rename get_original_mime_html() to more simple get_html()

* rename api is_mime_modified() to more simple has_html(); internally, mime_modified-flag stays as-is, however

* rename MimeS to MimeMultipartType

* do not set mime-modified flag for encrypted messages that need extra-handling for saved mime-structure

* fix typo

* move get_msg_html() to MsgId.get_html()

* incorporate more documentation suggestions from review

* remove unused return value from collect_texts_recursive()

* avoid mime_modified being mutable in write-parts-loop

* move 'use futures::future::FutureExt' atop of html.rs

* move attributes defining plain-text to a dedicated structure

* more PlainText to separate file

* escape cid when building regex

* let dc_get_msg_html() return NULL when calling with bad param
2021-01-11 17:40:35 +01:00
Hocuri
ec83fae314 Parse name="quote" divs (#2104)
fix #1560 Replies in html-only format are not converted nicely wrt Quoting
2020-12-13 18:02:20 +01:00
Hocuri
3c6d52842e Doc comments are show in HTML documentation.
This is not a proper documentation, just a note on implementation.
2020-10-19 13:07:55 +02:00
Hocuri
4d2542cee5 Don't show HTML if there is no content and there is a file attached
Fix https://github.com/deltachat/deltachat-core-rust/issues/1982
2020-10-19 13:07:55 +02:00
Alexander Krotov
67cddedf7e Switch from lazy_static to once_cell 2020-10-18 15:47:21 +03:00
Hocuri
3faf968b7c Fix tests 2020-08-19 20:03:08 +02:00
Hocuri
1a736ca6c3 Fix #1804: remove <!doctype html> and accept invalid HTML
This fixes #1804 in two ways: First, it removes a <!doctype html> from
the start of the mail, if there is any.

Then, it parses the html itself it quick-xml fails, just stripping
everything between < and >.

Both of these would have fixed this specific issue.

Also, add tests for both fixes.
2020-08-19 20:03:08 +02:00
Alexander Krotov
18d8ef9ffc dehtml: handle empty tags 2020-08-07 23:18:34 +03:00
Alexander Krotov
db5b5d321b clippy: remove redundant imports 2020-04-13 23:02:57 +03:00
Hocuri
134b09dba5 Fix #1373, ignore incorrect html close tags 2020-04-13 17:40:07 +02:00
Alexander Krotov
fe4080d59f refactor(simplify): move dehtml dependency to mimeparser
This change also removes unnecessary String clone for HTML messages.
2019-12-20 12:55:57 +01:00
Alexander Krotov
a242fcfd2c refactor(dehtml): remove Result unwrap in dehtml_starttag_cb() 2019-12-18 23:24:43 +03:00
Alexander Krotov
694d8fd6fb Move dc_dehtml to dehtml and remove unnecessary is_empty check 2019-12-01 13:37:37 +01:00