For SMTP errors we already format `last_send_error` with {:#},
but for IMAP errors we have formatted the errors with .to_string().
This resulted in errors such as
"Error: IMAP failed to connect to example.org:443:tls"
instead of
"Error: IMAP failed to connect to example.org:443:tls: Connection failure: Network is unreachable (os error 101)."
in the connectivity view HTML.
Large size of Mimefactor.render() futures
increases the size of all callers
down to set_config() and background_fetch().
I also had to Box::pin one call to fetch_new_msg_batch
and one call to fetch_single_msg.
I used some AI to draft a first version of this, and then reworked it.
This is one of multiple possibilities to fix
https://github.com/chatmail/core/issues/8041: For bots, the IncomingMsg
event is not emitted when the pre-message arrives, only when the
post-message arrives. Also, post-messages are downloaded immediately,
not after all the small messages are downloaded.
The `get_next_msgs()` API is deprecated. Instead, bots need to listen to
the IncomingMsg event in order to be notified about new events. Is this
acceptable for bots?
THE PROBLEM THAT WAS SOLVED BY THIS:
With pre-messages, it's hard for bots to wait for the message to be fully downloaded and then process it.
Up until now, bots used get_next_msgs() to query the unprocessed messages, then set last_msg_id after processing a message to that they won't process it again.
But this will now also return messages that were not fully downloaded.
ALTERNATIVES:
In the following, I will explain
the alternatives, and for why it's not so easy to just make the
`get_next_msgs()` API work. If it's not understandable, I'm happy to
elaborate more.
Core can't just completely ignore the pre-message for two reasons:
- If a post-message containing a Webxdc arrives later, and some webxdc updates arrive in the meantime, then these updates will be lost.
- The post-message doesn't contain the text (reasoning was to avoid duplicate text for people who didn't upgrade yet during the 2.43.0 rollout)
There are multiple solutions:
- Add the message as hidden in the database when the pre-message arrives.
- When the post-message arrives and we want to make it available for bots, we need to update the msg_id because of how the `get_next_msgs()` API works. This means that we need to update all webxdc updates that reference this msg_id.
- Alternatively, we could make webxdc's reference the rfc724_mid instead of the msg_id, so that we don't need stable msg_ids anymore
- Alternatively, we could deprecate `get_next_msgs()`, and ask bots to use plain events for message processing again. It's not that bad; worst case, the bot crashes and then forgets to react to some messages, but the user will just try again. And if some message makes the bot crash, then it might actually be good not to try and process it again.
- Store the pre-message text and `PostMsgMetadata` (or alternatively, the whole mime) in a new database table. Wait with processing it until the post-message arrives.
Additionally, the logic that small messages are downloaded before post-messages should be disabled for bots, in order to prevent reordering.
With `ORDER BY` statement SQLite searches
the `imap` table by `transport_id` and for each found row
scans the whole `imap_markseen` table.
Number of `imap` entries for each `transport_id`
is usually large as we need to know
which UIDs to delete on IMAP server
when deleting a message.
```
sqlite> EXPLAIN QUERY PLAN
SELECT imap.id, uid, folder FROM imap, imap_markseen
WHERE imap.id = imap_markseen.id
AND imap.transport_id=?
AND target = folder
ORDER BY folder, uid;
QUERY PLAN
|--SEARCH imap USING INDEX sqlite_autoindex_imap_1 (transport_id=?)
`--SCAN imap_markseen
```
Without `ORDER BY` statement SQLite scans `imap_markseen`
table which is expected to be small,
and then searches `imap` table by `rowid` for each found result.
```
sqlite> EXPLAIN QUERY PLAN
SELECT imap.id, uid, folder FROM imap, imap_markseen
WHERE imap.id = imap_markseen.id
AND imap.transport_id=?
AND target = folder;
QUERY PLAN
|--SCAN imap_markseen
`--SEARCH imap USING INTEGER PRIMARY KEY (rowid=?)
```
Query planning was tested with SQLite 3.52.0.
It is possible to explictly make
query planner move sorting to the last step
with `ORDER +folder, +uid`, but this is not recommended
in SQLite documentation
(see <https://www.sqlite.org/optoverview.html#uplus>).
It is also possible to add indexes,
but indexes use space,
adding them requires an SQL migration,
and each index needs to be updated so it will slow down writes.
Only inbox loop is changed because non-inbox loop is going to be removed
together with `mvbox_move`.
Added transport IDs to the log and logging around quota updates.
Removed some logs that add noise,
like logging that IDLE is supported each time right before using it.
Previously, if the mvbox_move folder is missing, then core will loop
infinitely, because `new_mail` is never set to false.
The fix is to first set `new_mail` to false, then return if the folder
is missing.
This is the bug @hpk42 experienced when commenting in
https://github.com/chatmail/core/issues/7989
---------
Co-authored-by: holger krekel <holger@merlinux.eu>
Don't depend on these 3 cleartext headers for the question whether we
download a message.
This PR will waste a bit of bandwidth for people who use the legacy
show_emails option; apart from that, there is no user-visible change
yet. It's a preparation for being able to remove these headers, in order
to further reduce unencrypted metadata.
Removing In-Reply-To and References will be easy; removing Chat-Version
must happen at least one release after the PR here is released, so that
people don't miss messages. Also, maybe some nerds depend on the
Chat-Version header for server-side filtering of messages, but we shall
have this discussion at some other time.
For the question whether a message should be moved, we do still depend
on them; this will be fixed with
https://github.com/chatmail/core/pull/7780.
When both this PR and #7780 are merged, we can stop requesting
Chat-Version header during prefetch.
Hostname resolution may timeout if DNS servers are not responding.
It is also not necessary to resolve fallback ICE server hostnames
if the user is not going to use calls.
Otherwise wrong IMAP client corresponding to a different transport
may pick up the job to mark the message as seen there,
and fail to do it as the message does not exist.
It may also mark the wrong message with the correct folder
and UID, but wrong IMAP server.
There was a comment that group messages should always be downloaded to avoid inconsistent group
state, but this is solved by the group consistency algo nowadays in the sense that inconsistent
group state won't spread to other members if we send to the group.
At urgent request from @hpk42, this adds a config option `team_profile`.
This option is only settable via SQLite (not exposed in the UI), and the
only thing it does is disabling synchronization of seen status.
I tested manually on my Android phone that it works.
Not straigthforward to write an automatic test, because we want to test that something
does _not_ happen (i.e. that the seen status is _not_ synchronized), and
it's not clear how long to wait before we check.
Probably it's fine to just not add a test.
This is what I tried:
```python
@pytest.mark.parametrize("team_profile", [True, False])
def test_markseen_basic(team_profile, acfactory):
"""
Test that seen status is synchronized iff `team_profile` isn't set.
"""
alice, bob = acfactory.get_online_accounts(2)
# Bob sets up a second device.
bob2 = bob.clone()
bob2.start_io()
alice_chat_bob = alice.create_chat(bob)
bob.create_chat(alice)
bob2.create_chat(alice)
alice_chat_bob.send_text("Hello Bob!")
message = bob.wait_for_incoming_msg()
message2 = bob2.wait_for_incoming_msg()
assert message2.get_snapshot().state == MessageState.IN_FRESH
message.mark_seen()
# PROBLEM: We're not waiting 'long enough',
# so, the 'state == MessageState.IN_SEEN' assertion below fails
bob.create_chat(bob).send_text("Self-sent message")
self_sent = bob2.wait_for_msg(EventType.MSGS_CHANGED)
assert self_sent.get_snapshot().text == "Self-sent message"
if team_profile:
assert message2.get_snapshot().state == MessageState.IN_FRESH
else:
assert message2.get_snapshot().state == MessageState.IN_SEEN
```
Currently there is a race between transports
to upload sync messages and delete them from `imap_send` table.
Sometimes mulitple transports upload the same message
and sometimes only some of them "win".
With this change only the primary transport
will upload the sync message.
For multiple transports we will need to run
multiple IMAP clients in parallel.
UID validity change detected by one IMAP client
should not result in UID resync
for another IMAP client.
`login_param` module is now for user-visible entered login parameters,
while the `transport` module contains structures for internal
representation of connection candidate list
created during transport configuration.
It's used in `fetch_existing_msgs()`, but we can remove it and tell users that they need to
move/copy messages from Sentbox to Inbox so that Delta Chat adds all contacts from them. This way
users will be also informed that Delta Chat needs users to CC/BCC/To themselves to see messages sent
from other MUAs.
The motivation is to reduce code complexity, get rid of the extra IMAP connection and cases when
messages are added to chats by Inbox and Sentbox loops in parallel which leads to various message
sorting bugs, particularly to outgoing messages breaking sorting of incoming ones which are fetched
later, but may have a smaller "Date".
We already have both rand 0.8 and rand 0.9
in our dependency tree.
We still need rand 0.8 because
public APIs of rPGP 0.17.0 and Iroh 0.35.0
use rand 0.8 types in public APIs,
so it is imported as rand_old.
We do not try to delete resent messages
anymore. Previously resent messages
were distinguised by having duplicate Message-ID,
but future Date, but now we need to download
the message before we even see the Date.
We now move the message to the destination folder
but do not fetch it.
It may not be a good idea to delete
the duplicate in multi-device setups anyway,
because the device which has a message
may delete the duplicate of a message
the other device missed.
To avoid triggering IMAP busy move loop
described in the comments
we now only move the messages
from INBOX and Spam folders.
I have logs from a user where messages are prefetched for long minutes, and while it's not a problem
on its own, we can't rely that the connection overlives such a period, so make
`fetch_new_messages()` prefetch (and then actually download) messages in batches of 500 messages.
Ignoring `receive_imf_inner()` errors, i.e. silently skipping messages on failures, leads to bugs
never fixed. As for temporary I/O errors, ignoring them leads to lost messages, in this case it's
better to bubble up the error and get the IMAP loop stuck. However if there's some logic error, it's
better to show it to the user so that it's more likely reported, and continue receiving messages. To
distinguish these cases, on error, try adding the message as partially downloaded with the error set
to `msgs.error`, this way the user also can retry downloading the message to finally see it if the
problem is fixed.
Actually this leads to fetching messages only from watched folders and Spam. Motivation:
- At least Gmail has virtual folders which aren't correctly detected as such, e.g. "Sent".
- At least Gmail has many virtual folders and scanning all of them takes significant time, 5-6 secs
in median for me. This slows down receiving new messages and consumes battery.
- Delta Chat shouldn't fetch messages from folders potentially created by other apps for their own
purposes. NB: All compatible Delta Chat forks should use the "DeltaChat" folder as mvbox.
- Fetching from folders that aren't watched, e.g. from "Sent", may lead to message ordering issues.