This change adds a test and updates mailparse from 0.15.0 to 0.16.0.
mailparse 0.16.0 includes a fix for the bug
that resulted in CRLF being included at the end of the body.
Workaround for the bug in the `pk_validate` function is also removed.
When receiving messages, blobs will be deduplicated with the new
function `create_and_deduplicate_from_bytes()`. For sending files, this
adds a new function `set_file_and_deduplicate()` instead of
deduplicating by default.
This is for
https://github.com/deltachat/deltachat-core-rust/issues/6265; read the
issue description there for more details.
TODO:
- [x] Set files as read-only
- [x] Don't do a write when the file is already identical
- [x] The first 32 chars or so of the 64-character hash are enough. I
calculated that if 10b people (i.e. all of humanity) use DC, and each of
them has 200k distinct blob files (I have 4k in my day-to-day account),
and we used 20 chars, then the expected value for the number of name
collisions would be ~0.0002 (and the probability that there is a least
one name collision is lower than that) [^1]. I added 12 more characters
to be on the super safe side, but this wouldn't be necessary and I could
also make it 20 instead of 32.
- Not 100% sure whether that's necessary at all - it would mainly be
necessary if we might hit a length limit on some file systems (the
blobdir is usually sth like
`accounts/2ff9fc096d2f46b6832b24a1ed99c0d6/dc.db-blobs` (53 chars), plus
64 chars for the filename would be 117).
- [x] "touch" the files to prevent them from being deleted
- [x] TODOs in the code
For later PRs:
- Replace `BlobObject::create(…)` with
`BlobObject::create_and_deduplicate(…)` in order to deduplicate
everytime core creates a file
- Modify JsonRPC to deduplicate blob files
- Possibly rename BlobObject.name to BlobObject.file in order to prevent
confusion (because `name` usually means "user-visible-name", not "name
of the file on disk").
[^1]: Calculated with both https://printfn.github.io/fend/ and
https://www.geogebra.org/calculator, both of which came to the same
result
([1](https://github.com/user-attachments/assets/bbb62550-3781-48b5-88b1-ba0e29c28c0d),
[2](https://github.com/user-attachments/assets/82171212-b797-4117-a39f-0e132eac7252))
---------
Co-authored-by: l <link2xt@testrun.org>
- This [is said to lead improve compilation
speed](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html#Assorted-Tricks)
- When grepping for a function invocation, this makes it easy to see whether it's from a test or "real" code
- We're calling the files e.g. `chat_tests.rs` instead of `tests.rs` for the same reason why we moved `imap/mod.rs` to `imap.rs`: Otherwise, your editor always shows you that you're in the file `tests.rs` and you don't know which one.
This is only moving mimeparser and chat tests, because these were the
biggest files; we can move more files in subsequent PRs if we like it.