mirror of
https://github.com/chatmail/core.git
synced 2026-05-16 21:36:30 +03:00
add option to access original message (#2125)
* draft API to deal with uncut message texts
* add column mime_modified
* add mime_modified flag to MimeParser and save it in the database
* save mime_headers also when mime_modified is set
* cargo fmt
* set mime_modified on parsed html-texts and when there are multiple alternative-parts; add test for that
* prototype functions, add to repl and ffi
* use correct mime_modified flag
* basically parse Mime-Structure to HTML
* add basic tests for HTML-parsing
* convert text/plain to html for getting original
* respect charset for plain texts
* make test more specific
* fix handling non-utf-8 charsets for plain messages
* add test for plain_to_html()
* add failing test for plaintext linkify
* linkify urls in plain text
* fix regex
* plain text linkify: add failing test for encapsulated links as <https://domain.com>
* plain text linkify: make encapsulated links as <https://domain.com> work
* plain text linkify: require word boundary at beginning of link, add tests for that
* plain text linkify: linkify emails
* plain text: support format=flowed
* plain text: support quotes
* make clippy happy
* set mime-modified also when simplify() cuts non-html messages, add tests for that
* streamline mime recursion
* repl tool: write original html to file for further processing
* convert cid:- to data:-protocol
* add a test for cid: to data: conversion
* make clippy happy
* fix html-tests to work with windows-lineends
* clarify what the returned html-code may contain
* add some more detailed doc comments
* add mime_modified column only if it does not exist; this additional check is needed as the column may have been added with another dbversion in some shipped beta-versions.
* incorporate documentation suggestions from review
* rename get_original_mime_html() to more simple get_html()
* rename api is_mime_modified() to more simple has_html(); internally, mime_modified-flag stays as-is, however
* rename MimeS to MimeMultipartType
* do not set mime-modified flag for encrypted messages that need extra-handling for saved mime-structure
* fix typo
* move get_msg_html() to MsgId.get_html()
* incorporate more documentation suggestions from review
* remove unused return value from collect_texts_recursive()
* avoid mime_modified being mutable in write-parts-loop
* move 'use futures::future::FutureExt' atop of html.rs
* move attributes defining plain-text to a dedicated structure
* move PlainText to separate file
* escape cid when building regex
* let dc_get_msg_html() return NULL when calling with bad param
@@ -1447,6 +1447,54 @@ int dc_set_chat_mute_duration (dc_context_t* context, ui
|
|||||||
char* dc_get_msg_info (dc_context_t* context, uint32_t msg_id);
|
char* dc_get_msg_info (dc_context_t* context, uint32_t msg_id);
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get uncut message, if available.
|
||||||
|
*
|
||||||
|
* Delta Chat tries to break the message in simple parts as plain text or images
|
||||||
|
* that are retrieved using dc_msg_get_viewtype(), dc_msg_get_text(), dc_msg_get_file() and so on.
|
||||||
|
* This works totally fine for Delta Chat to Delta Chat communication,
|
||||||
|
* however, when the counterpart uses another E-Mail-client, this has limits:
|
||||||
|
*
|
||||||
|
* - even if we do some good job on removing quotes,
|
||||||
|
* sometimes one needs to see them
|
||||||
|
* - HTML-only messages might lose information on conversion to text,
|
||||||
|
* esp. when there are lots of embedded images
|
||||||
|
* - even if there is some plain text part for a HTML-message,
|
||||||
|
* this is often poor and not nicely usable due to long links
|
||||||
|
*
|
||||||
|
* In these cases, dc_msg_has_html() returns 1
|
||||||
|
* and you can ask dc_get_msg_html() for some HTML-code
|
||||||
|
* that shows the uncut text (which is close to the original).
|
||||||
|
* For simplicity, the function _always_ returns HTML-code,
|
||||||
|
* this removes the need for the UI
|
||||||
|
* to deal with different formatting options of PLAIN-parts.
|
||||||
|
*
|
||||||
|
* **Note:** The returned HTML-code may contain scripts,
|
||||||
|
* external images that may be misused as hidden read-receipts and so on.
|
||||||
|
* Taking care of these parts
|
||||||
|
* while maintaining compatibility with the then generated HTML-code
|
||||||
|
* is not easily doable, if at all.
|
||||||
|
* Eg. taking care of tags and attributes is not sufficient,
|
||||||
|
* we would have to deal with linked content (eg. script, css),
|
||||||
|
* text (eg. script-blocks) and values (eg. javascript-protocol) as well;
|
||||||
|
* on this level, we have to deal with encodings, browser peculiarities and so on -
|
||||||
|
* and would still risk overlooking something and breaking things.
|
||||||
|
*
|
||||||
|
* To avoid starting this cat-and-mouse game,
|
||||||
|
* and to close this issue in a sustainable way,
|
||||||
|
* it is up to the UI to display the HTML-code in an **appropriate sandbox environment** -
|
||||||
|
* that may eg. be an external browser or a WebView with scripting disabled.
|
||||||
|
*
|
||||||
|
* @memberof dc_context_t
|
||||||
|
* @param context The context object.
|
||||||
|
* @param msg_id The message id for which the uncut text should be loaded.
|
||||||
|
* @return Uncut text as HTML.
|
||||||
|
* In case of errors, NULL is returned.
|
||||||
|
* The result must be released using dc_str_unref().
|
||||||
|
*/
|
||||||
|
char* dc_get_msg_html (dc_context_t* context, uint32_t msg_id);
|
||||||
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Get the raw mime-headers of the given message.
|
* Get the raw mime-headers of the given message.
|
||||||
* Raw headers are saved for incoming messages
|
* Raw headers are saved for incoming messages
|
||||||
@@ -3603,6 +3651,32 @@ int dc_msg_get_videochat_type (const dc_msg_t* msg);
|
|||||||
#define DC_VIDEOCHATTYPE_JITSI 2
|
#define DC_VIDEOCHATTYPE_JITSI 2
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Checks if the message has a full HTML version.
|
||||||
|
*
|
||||||
|
* Messages have a full HTML version
|
||||||
|
* if the original message _may_ contain important parts
|
||||||
|
* that are removed by some heuristics
|
||||||
|
* or if the message is just too long or too complex
|
||||||
|
* to get displayed properly by just using plain text.
|
||||||
|
* If so, the UI should offer a button such as
|
||||||
|
* "Show full message" that shows the uncut message using dc_get_msg_html().
|
||||||
|
*
|
||||||
|
* Even if a "Show full message" button is recommended,
|
||||||
|
* the UI should display the text in the bubble
|
||||||
|
* using the normal dc_msg_get_text() function -
|
||||||
|
* which will still be fine in many cases.
|
||||||
|
*
|
||||||
|
* @memberof dc_msg_t
|
||||||
|
* @param msg The message object.
|
||||||
|
* @return 0=Message as displayed using dc_msg_get_text() is just fine;
|
||||||
|
* 1=The message has a full HTML version,
|
||||||
|
* should be displayed using dc_msg_get_text()
|
||||||
|
* and a button to show the full version should be offered
|
||||||
|
*/
|
||||||
|
int dc_msg_has_html (dc_msg_t* msg);
|
||||||
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Set the text of a message object.
|
* Set the text of a message object.
|
||||||
* This does not alter any information in the database; this may be done by dc_send_msg() later.
|
* This does not alter any information in the database; this may be done by dc_send_msg() later.
|
||||||
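The two header comments above describe a simple contract for UIs: the bubble always shows the text from `dc_msg_get_text()`, and a "Show full message" button is offered only when `dc_msg_has_html()` returns 1. A minimal sketch of that decision in Rust, assuming a hypothetical `BubbleMsg` struct and `render_bubble()` helper that are not part of the library:

```rust
/// Hypothetical, simplified view of a message as a UI might hold it.
/// Only the two properties discussed in the header comments are modelled.
struct BubbleMsg {
    /// What dc_msg_get_text() would return.
    text: String,
    /// What dc_msg_has_html() would return, as a bool.
    has_html: bool,
}

/// What the UI draws for one message bubble.
#[derive(Debug, PartialEq)]
struct Bubble {
    text: String,
    show_full_message_button: bool,
}

/// The bubble text is always the normal message text;
/// the button is an addition, never a replacement.
fn render_bubble(msg: &BubbleMsg) -> Bubble {
    Bubble {
        text: msg.text.clone(),
        show_full_message_button: msg.has_html,
    }
}
```

When the button is pressed, the UI would fetch the HTML with `dc_get_msg_html()` and show it in a sandboxed view, as the note in the header comment requires.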
|
|||||||
@@ -1389,6 +1389,20 @@ pub unsafe extern "C" fn dc_get_msg_info(
|
|||||||
block_on(message::get_msg_info(&ctx, MsgId::new(msg_id))).strdup()
|
block_on(message::get_msg_info(&ctx, MsgId::new(msg_id))).strdup()
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[no_mangle]
|
||||||
|
pub unsafe extern "C" fn dc_get_msg_html(
|
||||||
|
context: *mut dc_context_t,
|
||||||
|
msg_id: u32,
|
||||||
|
) -> *mut libc::c_char {
|
||||||
|
if context.is_null() {
|
||||||
|
eprintln!("ignoring careless call to dc_get_msg_html()");
|
||||||
|
return ptr::null_mut();
|
||||||
|
}
|
||||||
|
let ctx = &*context;
|
||||||
|
|
||||||
|
block_on(MsgId::new(msg_id).get_html(&ctx)).strdup()
|
||||||
|
}
|
||||||
|
|
||||||
#[no_mangle]
|
#[no_mangle]
|
||||||
pub unsafe extern "C" fn dc_get_mime_headers(
|
pub unsafe extern "C" fn dc_get_mime_headers(
|
||||||
context: *mut dc_context_t,
|
context: *mut dc_context_t,
|
||||||
@@ -2861,6 +2875,16 @@ pub unsafe extern "C" fn dc_msg_is_setupmessage(msg: *mut dc_msg_t) -> libc::c_i
|
|||||||
ffi_msg.message.is_setupmessage().into()
|
ffi_msg.message.is_setupmessage().into()
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[no_mangle]
|
||||||
|
pub unsafe extern "C" fn dc_msg_has_html(msg: *mut dc_msg_t) -> libc::c_int {
|
||||||
|
if msg.is_null() {
|
||||||
|
eprintln!("ignoring careless call to dc_msg_has_html()");
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
let ffi_msg = &*msg;
|
||||||
|
ffi_msg.message.has_html().into()
|
||||||
|
}
|
||||||
|
|
||||||
#[no_mangle]
|
#[no_mangle]
|
||||||
pub unsafe extern "C" fn dc_msg_get_videochat_url(msg: *mut dc_msg_t) -> *mut libc::c_char {
|
pub unsafe extern "C" fn dc_msg_get_videochat_url(msg: *mut dc_msg_t) -> *mut libc::c_char {
|
||||||
if msg.is_null() {
|
if msg.is_null() {
|
||||||
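Both new FFI functions follow the same guard pattern: check the incoming pointer for NULL, log the careless call, and return a neutral value (NULL or 0). The sketch below reproduces that pattern with the standard library only. `DemoContext`, `demo_get_msg_html()` and `demo_str_unref()` are hypothetical names, and `CString::into_raw()` stands in for the `strdup()` helper used in the diff.

```rust
use std::ffi::CString;
use std::os::raw::c_char;
use std::ptr;

/// Hypothetical stand-in for the context type; not the real `dc_context_t`.
pub struct DemoContext {
    pub html: String,
}

/// Returns a newly allocated C string, or NULL when `context` is NULL.
///
/// # Safety
/// `context` must be NULL or point to a valid `DemoContext`.
pub unsafe extern "C" fn demo_get_msg_html(
    context: *const DemoContext,
    _msg_id: u32,
) -> *mut c_char {
    if context.is_null() {
        eprintln!("ignoring careless call to demo_get_msg_html()");
        return ptr::null_mut();
    }
    // Safe to dereference: the pointer was checked above and the caller
    // promises it points to a valid object.
    let ctx = unsafe { &*context };
    // A C string cannot contain interior NUL bytes; report NULL in that case.
    match CString::new(ctx.html.as_str()) {
        Ok(s) => s.into_raw(),
        Err(_) => ptr::null_mut(),
    }
}

/// Releases a string returned by `demo_get_msg_html()`; NULL is accepted.
///
/// # Safety
/// `s` must be NULL or a pointer obtained from `demo_get_msg_html()`.
pub unsafe extern "C" fn demo_str_unref(s: *mut c_char) {
    if !s.is_null() {
        // Take ownership back so the allocation is freed on drop.
        drop(unsafe { CString::from_raw(s) });
    }
}
```

Returning NULL for a bad parameter lets C callers tell "no result" apart from an empty string, which matches the documented behaviour of `dc_get_msg_html()`.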
|
|||||||
@@ -21,6 +21,7 @@ use deltachat::qr::*;
|
|||||||
use deltachat::sql;
|
use deltachat::sql;
|
||||||
use deltachat::EventType;
|
use deltachat::EventType;
|
||||||
use deltachat::{config, provider};
|
use deltachat::{config, provider};
|
||||||
|
use std::fs;
|
||||||
|
|
||||||
/// Reset database tables.
|
/// Reset database tables.
|
||||||
/// Argument is a bitmask, executing single or multiple actions in one call.
|
/// Argument is a bitmask, executing single or multiple actions in one call.
|
||||||
@@ -185,7 +186,7 @@ async fn log_msg(context: &Context, prefix: impl AsRef<str>, msg: &Message) {
|
|||||||
let temp2 = dc_timestamp_to_str(msg.get_timestamp());
|
let temp2 = dc_timestamp_to_str(msg.get_timestamp());
|
||||||
let msgtext = msg.get_text();
|
let msgtext = msg.get_text();
|
||||||
println!(
|
println!(
|
||||||
"{}{}{}{}: {} (Contact#{}): {} {}{}{}{}{} [{}]",
|
"{}{}{}{}: {} (Contact#{}): {} {}{}{}{}{}{} [{}]",
|
||||||
prefix.as_ref(),
|
prefix.as_ref(),
|
||||||
msg.get_id(),
|
msg.get_id(),
|
||||||
if msg.get_showpadlock() { "🔒" } else { "" },
|
if msg.get_showpadlock() { "🔒" } else { "" },
|
||||||
@@ -193,6 +194,7 @@ async fn log_msg(context: &Context, prefix: impl AsRef<str>, msg: &Message) {
|
|||||||
&contact_name,
|
&contact_name,
|
||||||
contact_id,
|
contact_id,
|
||||||
msgtext.unwrap_or_default(),
|
msgtext.unwrap_or_default(),
|
||||||
|
if msg.has_html() { "[HAS-HTML]️" } else { "" },
|
||||||
if msg.get_from_id() == 1 as libc::c_uint {
|
if msg.get_from_id() == 1 as libc::c_uint {
|
||||||
""
|
""
|
||||||
} else if msg.get_state() == MessageState::InSeen {
|
} else if msg.get_state() == MessageState::InSeen {
|
||||||
@@ -384,6 +386,7 @@ pub async fn cmdline(context: Context, line: &str, chat_id: &mut ChatId) -> Resu
|
|||||||
===========================Message commands==\n\
|
===========================Message commands==\n\
|
||||||
listmsgs <query>\n\
|
listmsgs <query>\n\
|
||||||
msginfo <msg-id>\n\
|
msginfo <msg-id>\n\
|
||||||
|
html <msg-id>\n\
|
||||||
listfresh\n\
|
listfresh\n\
|
||||||
forward <msg-id> <chat-id>\n\
|
forward <msg-id> <chat-id>\n\
|
||||||
markseen <msg-id>\n\
|
markseen <msg-id>\n\
|
||||||
@@ -942,6 +945,16 @@ pub async fn cmdline(context: Context, line: &str, chat_id: &mut ChatId) -> Resu
|
|||||||
let res = message::get_msg_info(&context, id).await;
|
let res = message::get_msg_info(&context, id).await;
|
||||||
println!("{}", res);
|
println!("{}", res);
|
||||||
}
|
}
|
||||||
|
"html" => {
|
||||||
|
ensure!(!arg1.is_empty(), "Argument <msg-id> missing.");
|
||||||
|
let id = MsgId::new(arg1.parse()?);
|
||||||
|
let file = dirs::home_dir()
|
||||||
|
.unwrap_or_default()
|
||||||
|
.join(format!("msg-{}.html", id.to_u32()));
|
||||||
|
let html = id.get_html(&context).await;
|
||||||
|
fs::write(&file, html)?;
|
||||||
|
println!("HTML written to: {:#?}", file);
|
||||||
|
}
|
||||||
"listfresh" => {
|
"listfresh" => {
|
||||||
let msglist = context.get_fresh_msgs().await;
|
let msglist = context.get_fresh_msgs().await;
|
||||||
|
|
||||||
|
|||||||
@@ -802,11 +802,24 @@ async fn add_parts(
|
|||||||
let mut parts = std::mem::replace(&mut mime_parser.parts, Vec::new());
|
let mut parts = std::mem::replace(&mut mime_parser.parts, Vec::new());
|
||||||
let server_folder = server_folder.as_ref().to_string();
|
let server_folder = server_folder.as_ref().to_string();
|
||||||
let is_system_message = mime_parser.is_system_message;
|
let is_system_message = mime_parser.is_system_message;
|
||||||
let mime_headers = if save_mime_headers {
|
|
||||||
|
// if indicated by the parser,
|
||||||
|
// we save the full mime-message and add a flag
|
||||||
|
// that the ui should show a button to display the full message.
|
||||||
|
//
|
||||||
|
// (currently, we skip saving mime-messages for encrypted messages
|
||||||
|
// as there is probably no huge intersection between html-messages and encrypted messages,
|
||||||
|
// however, that should be doable, we need the decrypted mime-structure in this case)
|
||||||
|
|
||||||
|
// a flag used to avoid adding "show full message" button to multiple parts of the message.
|
||||||
|
let mut save_mime_modified = mime_parser.is_mime_modified && !mime_parser.was_encrypted();
|
||||||
|
|
||||||
|
let mime_headers = if save_mime_headers || save_mime_modified {
|
||||||
Some(String::from_utf8_lossy(imf_raw).to_string())
|
Some(String::from_utf8_lossy(imf_raw).to_string())
|
||||||
} else {
|
} else {
|
||||||
None
|
None
|
||||||
};
|
};
|
||||||
|
|
||||||
let sent_timestamp = *sent_timestamp;
|
let sent_timestamp = *sent_timestamp;
|
||||||
let is_hidden = *hidden;
|
let is_hidden = *hidden;
|
||||||
let chat_id = *chat_id;
|
let chat_id = *chat_id;
|
||||||
@@ -826,8 +839,9 @@ async fn add_parts(
|
|||||||
"INSERT INTO msgs \
|
"INSERT INTO msgs \
|
||||||
(rfc724_mid, server_folder, server_uid, chat_id, from_id, to_id, timestamp, \
|
(rfc724_mid, server_folder, server_uid, chat_id, from_id, to_id, timestamp, \
|
||||||
timestamp_sent, timestamp_rcvd, type, state, msgrmsg, txt, txt_raw, param, \
|
timestamp_sent, timestamp_rcvd, type, state, msgrmsg, txt, txt_raw, param, \
|
||||||
bytes, hidden, mime_headers, mime_in_reply_to, mime_references, error, ephemeral_timer, ephemeral_timestamp) \
|
bytes, hidden, mime_headers, mime_in_reply_to, mime_references, mime_modified, \
|
||||||
VALUES (?,?,?,?,?,?, ?,?,?,?,?,?, ?,?,?,?,?,?, ?,?, ?,?,?);",
|
error, ephemeral_timer, ephemeral_timestamp) \
|
||||||
|
VALUES (?,?,?,?,?,?,?, ?,?,?,?,?,?,?,?, ?,?,?,?,?,?, ?,?,?);",
|
||||||
)?;
|
)?;
|
||||||
|
|
||||||
let is_location_kml = location_kml_is
|
let is_location_kml = location_kml_is
|
||||||
@@ -841,6 +855,12 @@ async fn add_parts(
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
let mime_modified = save_mime_modified && !part.msg.is_empty();
|
||||||
|
if mime_modified {
|
||||||
|
// Avoid setting mime_modified for more than one part.
|
||||||
|
save_mime_modified = false;
|
||||||
|
}
|
||||||
|
|
||||||
if part.typ == Viewtype::Text {
|
if part.typ == Viewtype::Text {
|
||||||
let msg_raw = part.msg_raw.as_ref().cloned().unwrap_or_default();
|
let msg_raw = part.msg_raw.as_ref().cloned().unwrap_or_default();
|
||||||
txt_raw = format!("{}\n\n{}", subject, msg_raw);
|
txt_raw = format!("{}\n\n{}", subject, msg_raw);
|
||||||
@@ -854,7 +874,9 @@ async fn add_parts(
|
|||||||
} else {
|
} else {
|
||||||
match ephemeral_timer {
|
match ephemeral_timer {
|
||||||
EphemeralTimer::Disabled => 0,
|
EphemeralTimer::Disabled => 0,
|
||||||
EphemeralTimer::Enabled { duration } => rcvd_timestamp + i64::from(duration)
|
EphemeralTimer::Enabled { duration } => {
|
||||||
|
rcvd_timestamp + i64::from(duration)
|
||||||
|
}
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
@@ -877,9 +899,14 @@ async fn add_parts(
|
|||||||
part.param.to_string(),
|
part.param.to_string(),
|
||||||
part.bytes as isize,
|
part.bytes as isize,
|
||||||
is_hidden,
|
is_hidden,
|
||||||
mime_headers,
|
if save_mime_headers || mime_modified {
|
||||||
|
mime_headers.clone()
|
||||||
|
} else {
|
||||||
|
None
|
||||||
|
},
|
||||||
mime_in_reply_to,
|
mime_in_reply_to,
|
||||||
mime_references,
|
mime_references,
|
||||||
|
mime_modified,
|
||||||
part.error.take().unwrap_or_default(),
|
part.error.take().unwrap_or_default(),
|
||||||
ephemeral_timer,
|
ephemeral_timer,
|
||||||
ephemeral_timestamp
|
ephemeral_timestamp
|
||||||
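The hunks above set `mime_modified` on at most one part of a message: the first part with non-empty text, and only when the parser flagged the message and it was not encrypted. A small standalone sketch of that rule, assuming a hypothetical `mime_modified_per_part()` helper that sees only the text of each part:

```rust
/// Decide, per part, whether it should carry the `mime_modified` flag.
///
/// `part_texts` is a hypothetical simplification: only the text of each part.
fn mime_modified_per_part(
    part_texts: &[&str],
    is_mime_modified: bool,
    was_encrypted: bool,
) -> Vec<bool> {
    // Encrypted messages are skipped, as in the diff.
    let mut save_mime_modified = is_mime_modified && !was_encrypted;
    let mut flags = Vec::with_capacity(part_texts.len());
    for text in part_texts {
        let mime_modified = save_mime_modified && !text.is_empty();
        if mime_modified {
            // Avoid setting the flag for more than one part.
            save_mime_modified = false;
        }
        flags.push(mime_modified);
    }
    flags
}
```

For the parts `["", "a", "b"]` of a flagged, unencrypted message, only the second part gets the flag, so the UI shows a single "Show full message" button.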
|
|||||||
@@ -390,9 +390,10 @@ mod tests {
|
|||||||
let input = include_str!("../test-data/message/gmx-quote-body.eml");
|
let input = include_str!("../test-data/message/gmx-quote-body.eml");
|
||||||
let dehtml = dehtml(input).unwrap();
|
let dehtml = dehtml(input).unwrap();
|
||||||
println!("{}", dehtml);
|
println!("{}", dehtml);
|
||||||
let (msg, forwawded, top_quote) = simplify(dehtml, false);
|
let (msg, forwarded, cut, top_quote) = simplify(dehtml, false);
|
||||||
assert_eq!(msg, "Test");
|
assert_eq!(msg, "Test");
|
||||||
assert_eq!(forwawded, false);
|
assert_eq!(forwarded, false);
|
||||||
|
assert_eq!(cut, false);
|
||||||
assert_eq!(top_quote.as_deref(), Some("test"));
|
assert_eq!(top_quote.as_deref(), Some("test"));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -21,6 +21,7 @@ pub enum HeaderDef {
|
|||||||
InReplyTo,
|
InReplyTo,
|
||||||
Precedence,
|
Precedence,
|
||||||
ContentType,
|
ContentType,
|
||||||
|
ContentId,
|
||||||
ChatVersion,
|
ChatVersion,
|
||||||
ChatGroupId,
|
ChatGroupId,
|
||||||
ChatGroupName,
|
ChatGroupName,
|
||||||
|
|||||||
396
src/html.rs
Normal file
@@ -0,0 +1,396 @@
|
|||||||
|
//! # Get message as HTML.
//!
//! Use `Message.has_html()` to check if the UI shall render a
//! corresponding button and `MsgId.get_html()` to get the full message.
//!
//! Even when the original mime-message is not HTML,
//! `MsgId.get_html()` will return HTML -
//! this allows nice quoting, handling linebreaks properly etc.
|
||||||
|
use futures::future::FutureExt;
|
||||||
|
use std::future::Future;
|
||||||
|
use std::pin::Pin;
|
||||||
|
|
||||||
|
use lettre_email::mime::{self, Mime};
|
||||||
|
|
||||||
|
use crate::context::Context;
|
||||||
|
use crate::error::Result;
|
||||||
|
use crate::headerdef::{HeaderDef, HeaderDefMap};
|
||||||
|
use crate::message::{Message, MsgId};
|
||||||
|
use crate::mimeparser::parse_message_id;
|
||||||
|
use crate::plaintext::PlainText;
|
||||||
|
use mailparse::ParsedContentType;
|
||||||
|
|
||||||
|
impl Message {
|
||||||
|
/// Check if the message can be retrieved as HTML.
|
||||||
|
/// Typically, this is the case when the mime structure of a Message is modified,
|
||||||
|
/// meaning that some text is cut or the original message
|
||||||
|
/// is in HTML and `simplify()` may hide possibly important information.
|
||||||
|
/// The corresponding ffi-function is `dc_msg_has_html()`.
|
||||||
|
/// To get the HTML-code of the message, use `MsgId.get_html()`.
|
||||||
|
pub fn has_html(&self) -> bool {
|
||||||
|
self.mime_modified
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Type defining a rough mime-type.
|
||||||
|
/// This is mainly useful on iterating
|
||||||
|
/// to decide whether a mime-part has subtypes.
|
||||||
|
enum MimeMultipartType {
|
||||||
|
Multiple,
|
||||||
|
Single,
|
||||||
|
Message,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Function takes a content type from a ParsedMail structure
|
||||||
|
/// and checks and returns the rough mime-type.
|
||||||
|
async fn get_mime_multipart_type(ctype: &ParsedContentType) -> MimeMultipartType {
|
||||||
|
let mimetype = ctype.mimetype.to_lowercase();
|
||||||
|
if mimetype.starts_with("multipart") && ctype.params.get("boundary").is_some() {
|
||||||
|
MimeMultipartType::Multiple
|
||||||
|
} else if mimetype == "message/rfc822" {
|
||||||
|
MimeMultipartType::Message
|
||||||
|
} else {
|
||||||
|
MimeMultipartType::Single
|
||||||
|
}
|
||||||
|
}
|
||||||
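The classification above can be tried out without `mailparse`. The sketch below is a simplified, synchronous variant that takes the mimetype string and a flag for "has a boundary parameter" directly; `PartKind` and `classify()` are hypothetical names, not part of the module.

```rust
/// Rough kind of a mime-part, mirroring the three cases in the diff.
#[derive(Debug, PartialEq)]
enum PartKind {
    Multiple,
    Single,
    Message,
}

/// Classify a part from its mimetype and the presence of a `boundary` parameter.
fn classify(mimetype: &str, has_boundary: bool) -> PartKind {
    // Mimetypes are case-insensitive, so compare in lowercase.
    let mimetype = mimetype.to_lowercase();
    if mimetype.starts_with("multipart") && has_boundary {
        PartKind::Multiple
    } else if mimetype == "message/rfc822" {
        PartKind::Message
    } else {
        PartKind::Single
    }
}
```

Note that a `multipart/*` part without a boundary falls through to the single-part case, since its subparts cannot be split.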
|
|
||||||
|
/// HtmlMsgParser converts a mime-message to HTML.
|
||||||
|
#[derive(Debug)]
|
||||||
|
struct HtmlMsgParser {
|
||||||
|
pub html: String,
|
||||||
|
pub plain: Option<PlainText>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl HtmlMsgParser {
|
||||||
|
/// Function takes a raw mime-message string,
|
||||||
|
/// searches for the main-text part
|
||||||
|
/// and returns that as parser.html
|
||||||
|
pub async fn from_bytes(context: &Context, rawmime: &[u8]) -> Result<Self> {
|
||||||
|
let mut parser = HtmlMsgParser {
|
||||||
|
html: "".to_string(),
|
||||||
|
plain: None,
|
||||||
|
};
|
||||||
|
|
||||||
|
let parsedmail = mailparse::parse_mail(rawmime)?;
|
||||||
|
|
||||||
|
parser.collect_texts_recursive(context, &parsedmail).await?;
|
||||||
|
|
||||||
|
if parser.html.is_empty() {
|
||||||
|
if let Some(plain) = &parser.plain {
|
||||||
|
parser.html = plain.to_html().await;
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
parser.cid_to_data_recursive(context, &parsedmail).await?;
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(parser)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Function iterates over all mime-parts
|
||||||
|
/// and searches for text/plain and text/html parts and saves the
|
||||||
|
/// last one found
|
||||||
|
/// in the corresponding structure fields.
|
||||||
|
/// Usually, there is at most one plain-text and one HTML-text part.
|
||||||
|
fn collect_texts_recursive<'a>(
|
||||||
|
&'a mut self,
|
||||||
|
context: &'a Context,
|
||||||
|
mail: &'a mailparse::ParsedMail<'a>,
|
||||||
|
) -> Pin<Box<dyn Future<Output = Result<()>> + 'a + Send>> {
|
||||||
|
// Boxed future to deal with recursion
|
||||||
|
async move {
|
||||||
|
match get_mime_multipart_type(&mail.ctype).await {
|
||||||
|
MimeMultipartType::Multiple => {
|
||||||
|
for cur_data in mail.subparts.iter() {
|
||||||
|
self.collect_texts_recursive(context, cur_data).await?
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
MimeMultipartType::Message => {
|
||||||
|
let raw = mail.get_body_raw()?;
|
||||||
|
if raw.is_empty() {
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
let mail = mailparse::parse_mail(&raw)?;
|
||||||
|
self.collect_texts_recursive(context, &mail).await
|
||||||
|
}
|
||||||
|
MimeMultipartType::Single => {
|
||||||
|
let mimetype = mail.ctype.mimetype.parse::<Mime>()?;
|
||||||
|
if mimetype == mime::TEXT_HTML {
|
||||||
|
if let Ok(decoded_data) = mail.get_body() {
|
||||||
|
self.html = decoded_data;
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
} else if mimetype == mime::TEXT_PLAIN {
|
||||||
|
if let Ok(decoded_data) = mail.get_body() {
|
||||||
|
self.plain = Some(PlainText {
|
||||||
|
text: decoded_data,
|
||||||
|
flowed: if let Some(format) = mail.ctype.params.get("format") {
|
||||||
|
format.as_str().to_ascii_lowercase() == "flowed"
|
||||||
|
} else {
|
||||||
|
false
|
||||||
|
},
|
||||||
|
delsp: if let Some(delsp) = mail.ctype.params.get("delsp") {
|
||||||
|
delsp.as_str().to_ascii_lowercase() == "yes"
|
||||||
|
} else {
|
||||||
|
false
|
||||||
|
},
|
||||||
|
});
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
.boxed()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Replace cid:-protocol by the data:-protocol where appropriate.
|
||||||
|
/// This allows the final html-file to be self-contained.
|
||||||
|
fn cid_to_data_recursive<'a>(
|
||||||
|
&'a mut self,
|
||||||
|
context: &'a Context,
|
||||||
|
mail: &'a mailparse::ParsedMail<'a>,
|
||||||
|
) -> Pin<Box<dyn Future<Output = Result<()>> + 'a + Send>> {
|
||||||
|
// Boxed future to deal with recursion
|
||||||
|
async move {
|
||||||
|
match get_mime_multipart_type(&mail.ctype).await {
|
||||||
|
MimeMultipartType::Multiple => {
|
||||||
|
for cur_data in mail.subparts.iter() {
|
||||||
|
self.cid_to_data_recursive(context, cur_data).await?;
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
MimeMultipartType::Message => {
|
||||||
|
let raw = mail.get_body_raw()?;
|
||||||
|
if raw.is_empty() {
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
let mail = mailparse::parse_mail(&raw)?;
|
||||||
|
self.cid_to_data_recursive(context, &mail).await
|
||||||
|
}
|
||||||
|
MimeMultipartType::Single => {
|
||||||
|
let mimetype = mail.ctype.mimetype.parse::<Mime>()?;
|
||||||
|
if mimetype.type_() == mime::IMAGE {
|
||||||
|
if let Some(cid) = mail.headers.get_header_value(HeaderDef::ContentId) {
|
||||||
|
if let Ok(cid) = parse_message_id(&cid) {
|
||||||
|
if let Ok(replacement) = mimepart_to_data_url(&mail).await {
|
||||||
|
let re_string = format!(
|
||||||
|
"(<img[^>]*src[^>]*=[^>]*)(cid:{})([^>]*>)",
|
||||||
|
regex::escape(&cid)
|
||||||
|
);
|
||||||
|
match regex::Regex::new(&re_string) {
|
||||||
|
Ok(re) => {
|
||||||
|
self.html = re
|
||||||
|
.replace_all(
|
||||||
|
&*self.html,
|
||||||
|
format!("${{1}}{}${{3}}", replacement).as_str(),
|
||||||
|
)
|
||||||
|
.as_ref()
|
||||||
|
.to_string()
|
||||||
|
}
|
||||||
|
Err(e) => warn!(
|
||||||
|
context,
|
||||||
|
"Cannot create regex for cid: {} throws {}",
|
||||||
|
re_string,
|
||||||
|
e
|
||||||
|
),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
.boxed()
|
||||||
|
}
|
||||||
|
}
|
||||||
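The cid: to data: step can be illustrated with plain substring replacement. This is a deliberately simpler technique than the regex in the diff: it is not restricted to `<img ... src=...>` attributes and replaces every occurrence of `cid:<id>`. `strip_angle_brackets()` and `cid_to_data()` are hypothetical helpers; the first one only approximates what `parse_message_id()` does for a Content-Id value.

```rust
/// Hypothetical helper: strip surrounding whitespace and one pair of
/// angle brackets from a Content-Id header value.
fn strip_angle_brackets(content_id: &str) -> &str {
    let trimmed = content_id.trim();
    trimmed
        .strip_prefix('<')
        .and_then(|s| s.strip_suffix('>'))
        .unwrap_or(trimmed)
}

/// Replace every `cid:<id>` reference in `html` by `data_url`.
fn cid_to_data(html: &str, content_id: &str, data_url: &str) -> String {
    let needle = format!("cid:{}", strip_angle_brackets(content_id));
    html.replace(&needle, data_url)
}
```

After the replacement the HTML no longer depends on the related image part, which is what makes the final file self-contained.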
|
|
||||||
|
/// Convert a mime part to a data: url as defined in [RFC 2397](https://tools.ietf.org/html/rfc2397).
|
||||||
|
async fn mimepart_to_data_url(mail: &mailparse::ParsedMail<'_>) -> Result<String> {
|
||||||
|
let data = mail.get_body_raw()?;
|
||||||
|
let data = base64::encode(&data);
|
||||||
|
Ok(format!("data:{};base64,{}", mail.ctype.mimetype, data))
|
||||||
|
}
|
||||||
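To see what `mimepart_to_data_url()` produces, here is a dependency-free sketch of the same format string. The hand-written `base64_encode()` replaces the `base64` crate used in the diff, and `to_data_url()` is a hypothetical name that takes the mimetype and the raw body directly.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 with `=` padding, written out by hand
/// so the sketch needs no external crate.
fn base64_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack up to three bytes into 24 bits; missing bytes count as zero.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Build a data: url as defined in RFC 2397 from a mimetype and a raw body.
fn to_data_url(mimetype: &str, body: &[u8]) -> String {
    format!("data:{};base64,{}", mimetype, base64_encode(body))
}
```

A JPEG body starts with the bytes `FF D8 FF E0 00 10`, which encode to `/9j/4AAQ`; this is the prefix the cid test in this file checks for.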
|
|
||||||
|
impl MsgId {
|
||||||
|
/// Get HTML from a message-id.
|
||||||
|
/// This requires `mime_headers` field to be set for the message;
|
||||||
|
/// this is the case at least when `Message.has_html()` returns true
|
||||||
|
/// (we do not save raw mime unconditionally in the database to save space).
|
||||||
|
/// The corresponding ffi-function is `dc_get_msg_html()`.
|
||||||
|
pub async fn get_html(self, context: &Context) -> String {
|
||||||
|
let rawmime: Option<String> = context
|
||||||
|
.sql
|
||||||
|
.query_get_value(
|
||||||
|
context,
|
||||||
|
"SELECT mime_headers FROM msgs WHERE id=?;",
|
||||||
|
paramsv![self],
|
||||||
|
)
|
||||||
|
.await;
|
||||||
|
|
||||||
|
if let Some(rawmime) = rawmime {
|
||||||
|
match HtmlMsgParser::from_bytes(context, rawmime.as_bytes()).await {
|
||||||
|
Err(err) => format!("parser error: {}", err),
|
||||||
|
Ok(parser) => parser.html,
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
format!("parser error: no mime for {}", self)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(test)]
|
||||||
|
mod tests {
|
||||||
|
use super::*;
|
||||||
|
use crate::test_utils::*;
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_plain_unspecified() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_plain_unspecified.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert_eq!(
|
||||||
|
parser.html,
|
||||||
|
r##"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
This message does not have Content-Type nor Subject.<br/>
|
||||||
|
<br/>
|
||||||
|
</body></html>
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_plain_iso88591() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_plain_iso88591.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert_eq!(
|
||||||
|
parser.html,
|
||||||
|
r##"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
message with a non-UTF-8 encoding: äöüßÄÖÜ<br/>
|
||||||
|
<br/>
|
||||||
|
</body></html>
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_plain_flowed() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_plain_flowed.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(parser.plain.unwrap().flowed);
|
||||||
|
assert_eq!(
|
||||||
|
parser.html,
|
||||||
|
r##"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
This line ends with a space and will be merged with the next one due to format=flowed.<br/>
|
||||||
|
<br/>
|
||||||
|
This line does not end with a space<br/>
|
||||||
|
and will be wrapped as usual.<br/>
|
||||||
|
<br/>
|
||||||
|
</body></html>
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_alt_plain() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_alt_plain.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert_eq!(
|
||||||
|
parser.html,
|
||||||
|
r##"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
mime-modified should not be set set as there is no html and no special stuff;<br/>
|
||||||
|
although not being a delta-message.<br/>
|
||||||
|
test some special html-characters as < > and & but also " and ' :)<br/>
|
||||||
|
<br/>
|
||||||
|
<br/>
|
||||||
|
</body></html>
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_html() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_html.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
|
||||||
|
// on windows, `\r\n` line ends are returned from mimeparser,
|
||||||
|
// however, rust multiline-strings use just `\n`;
|
||||||
|
// therefore, we just remove `\r` before comparison.
|
||||||
|
assert_eq!(
|
||||||
|
parser.html.replace("\r", ""),
|
||||||
|
r##"
|
||||||
|
<html>
|
||||||
|
<p>mime-modified <b>set</b>; simplify is always regarded as lossy.</p>
|
||||||
|
</html>"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_alt_html() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_alt_html.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert_eq!(
|
||||||
|
parser.html.replace("\r", ""), // see comment in test_htmlparse_html()
|
||||||
|
r##"<html>
|
||||||
|
<p>mime-modified <b>set</b>; simplify is always regarded as lossy.</p>
|
||||||
|
</html>
|
||||||
|
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_alt_plain_html() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_alt_plain_html.eml");
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert_eq!(
|
||||||
|
parser.html.replace("\r", ""), // see comment in test_htmlparse_html()
|
||||||
|
r##"<html>
|
||||||
|
<p>
|
||||||
|
this is <b>html</b>
|
||||||
|
</p>
|
||||||
|
</html>
|
||||||
|
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_htmlparse_apple_cid_jpg() {
|
||||||
|
// load raw mime html-data with related image-part (cid:)
|
||||||
|
// and make sure, Content-Id has angle-brackets that are removed correctly.
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/apple_cid_jpg.eml");
|
||||||
|
let test = String::from_utf8_lossy(raw);
|
||||||
|
assert!(test
|
||||||
|
.find("Content-Id: <8AE052EF-BC90-486F-BB78-58D3590308EC@fritz.box>")
|
||||||
|
.is_some());
|
||||||
|
assert!(test
|
||||||
|
.find("cid:8AE052EF-BC90-486F-BB78-58D3590308EC@fritz.box")
|
||||||
|
.is_some());
|
||||||
|
assert!(test.find("data:").is_none());
|
||||||
|
|
||||||
|
// parsing converts cid: to data:
|
||||||
|
let parser = HtmlMsgParser::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(parser.html.find("<html>").is_some());
|
||||||
|
assert!(parser.html.find("Content-Id:").is_none());
|
||||||
|
assert!(parser
|
||||||
|
.html
|
||||||
|
.find("data:image/jpeg;base64,/9j/4AAQ")
|
||||||
|
.is_some());
|
||||||
|
assert!(parser.html.find("cid:").is_none());
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -76,6 +76,8 @@ pub mod stock;
|
|||||||
mod token;
|
mod token;
|
||||||
#[macro_use]
|
#[macro_use]
|
||||||
mod dehtml;
|
mod dehtml;
|
||||||
|
pub mod html;
|
||||||
|
pub mod plaintext;
|
||||||
|
|
||||||
pub mod dc_receive_imf;
|
pub mod dc_receive_imf;
|
||||||
pub mod dc_tools;
|
pub mod dc_tools;
|
||||||
|
|||||||
@@ -110,7 +110,7 @@ impl MsgId {
|
|||||||
context
|
context
|
||||||
.sql
|
.sql
|
||||||
.execute(
|
.execute(
|
||||||
"UPDATE msgs SET chat_id=?, txt='', txt_raw='', from_id=0, to_id=0, param='' WHERE id=?",
|
"UPDATE msgs SET chat_id=?, txt='', txt_raw='', mime_headers='', from_id=0, to_id=0, param='' WHERE id=?",
|
||||||
paramsv![chat_id, self],
|
paramsv![chat_id, self],
|
||||||
)
|
)
|
||||||
.await?;
|
.await?;
|
||||||
@@ -264,6 +264,7 @@ pub struct Message {
|
|||||||
pub(crate) server_folder: Option<String>,
|
pub(crate) server_folder: Option<String>,
|
||||||
pub(crate) server_uid: u32,
|
pub(crate) server_uid: u32,
|
||||||
pub(crate) is_dc_message: MessengerMessage,
|
pub(crate) is_dc_message: MessengerMessage,
|
||||||
|
pub(crate) mime_modified: bool,
|
||||||
pub(crate) chat_blocked: Blocked,
|
pub(crate) chat_blocked: Blocked,
|
||||||
pub(crate) location_id: u32,
|
pub(crate) location_id: u32,
|
||||||
error: Option<String>,
|
error: Option<String>,
|
||||||
@@ -305,6 +306,7 @@ impl Message {
|
|||||||
" m.state AS state,",
|
" m.state AS state,",
|
||||||
" m.error AS error,",
|
" m.error AS error,",
|
||||||
" m.msgrmsg AS msgrmsg,",
|
" m.msgrmsg AS msgrmsg,",
|
||||||
|
" m.mime_modified AS mime_modified,",
|
||||||
" m.txt AS txt,",
|
" m.txt AS txt,",
|
||||||
" m.param AS param,",
|
" m.param AS param,",
|
||||||
" m.hidden AS hidden,",
|
" m.hidden AS hidden,",
|
||||||
@@ -335,6 +337,7 @@ impl Message {
|
|||||||
let error: String = row.get("error")?;
|
let error: String = row.get("error")?;
|
||||||
msg.error = Some(error).filter(|error| !error.is_empty());
|
msg.error = Some(error).filter(|error| !error.is_empty());
|
||||||
msg.is_dc_message = row.get("msgrmsg")?;
|
msg.is_dc_message = row.get("msgrmsg")?;
|
||||||
|
msg.mime_modified = row.get("mime_modified")?;
|
||||||
|
|
||||||
let text;
|
let text;
|
||||||
if let rusqlite::types::ValueRef::Text(buf) = row.get_raw("txt") {
|
if let rusqlite::types::ValueRef::Text(buf) = row.get_raw("txt") {
|
||||||
|
|||||||
@@ -65,6 +65,10 @@ pub struct MimeMessage {
|
|||||||
pub(crate) group_avatar: Option<AvatarAction>,
|
pub(crate) group_avatar: Option<AvatarAction>,
|
||||||
pub(crate) mdn_reports: Vec<Report>,
|
pub(crate) mdn_reports: Vec<Report>,
|
||||||
pub(crate) failure_report: Option<FailureReport>,
|
pub(crate) failure_report: Option<FailureReport>,
|
||||||
|
|
||||||
|
// if this flag is set, the parts/text/etc. are just close to the original mime-message;
|
||||||
|
// clients should offer a way to view the original message in this case
|
||||||
|
pub is_mime_modified: bool,
|
||||||
}
|
}
|
||||||
|
|
||||||
#[derive(Debug, PartialEq)]
|
#[derive(Debug, PartialEq)]
|
||||||
@@ -223,6 +227,7 @@ impl MimeMessage {
|
|||||||
user_avatar: None,
|
user_avatar: None,
|
||||||
group_avatar: None,
|
group_avatar: None,
|
||||||
failure_report: None,
|
failure_report: None,
|
||||||
|
is_mime_modified: false,
|
||||||
};
|
};
|
||||||
parser.parse_mime_recursive(context, &mail).await?;
|
parser.parse_mime_recursive(context, &mail).await?;
|
||||||
parser.maybe_remove_bad_parts();
|
parser.maybe_remove_bad_parts();
|
||||||
@@ -598,6 +603,12 @@ impl MimeMessage {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
if any_part_added && mail.subparts.len() > 1 {
|
||||||
|
// there are other alternative parts, likely HTML,
|
||||||
|
// so we might have missed some content on simplifying.
|
||||||
|
// set mime-modified to force the ui to display a show-message button.
|
||||||
|
self.is_mime_modified = true;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
(mime::MULTIPART, "encrypted") => {
|
(mime::MULTIPART, "encrypted") => {
|
||||||
// we currently do not try to decrypt non-autocrypt messages
|
// we currently do not try to decrypt non-autocrypt messages
|
||||||
@@ -727,20 +738,26 @@ impl MimeMessage {
|
|||||||
|
|
||||||
let mut dehtml_failed = false;
|
let mut dehtml_failed = false;
|
||||||
|
|
||||||
let (simplified_txt, is_forwarded, top_quote) = if decoded_data.is_empty() {
|
let (simplified_txt, is_forwarded, is_cut, top_quote) =
|
||||||
("".to_string(), false, None)
|
if decoded_data.is_empty() {
|
||||||
} else {
|
("".to_string(), false, false, None)
|
||||||
let is_html = mime_type == mime::TEXT_HTML;
|
|
||||||
let out = if is_html {
|
|
||||||
dehtml(&decoded_data).unwrap_or_else(|| {
|
|
||||||
dehtml_failed = true;
|
|
||||||
decoded_data.clone()
|
|
||||||
})
|
|
||||||
} else {
|
} else {
|
||||||
decoded_data.clone()
|
let is_html = mime_type == mime::TEXT_HTML;
|
||||||
|
let out = if is_html {
|
||||||
|
self.is_mime_modified = true;
|
||||||
|
dehtml(&decoded_data).unwrap_or_else(|| {
|
||||||
|
dehtml_failed = true;
|
||||||
|
decoded_data.clone()
|
||||||
|
})
|
||||||
|
} else {
|
||||||
|
decoded_data.clone()
|
||||||
|
};
|
||||||
|
simplify(out, self.has_chat_version())
|
||||||
};
|
};
|
||||||
simplify(out, self.has_chat_version())
|
|
||||||
};
|
self.is_mime_modified = self.is_mime_modified
|
||||||
|
|| ((is_forwarded || is_cut || top_quote.is_some())
|
||||||
|
&& !self.has_chat_version());
|
||||||
|
|
||||||
let is_format_flowed = if let Some(format) = mail.ctype.params.get("format")
|
let is_format_flowed = if let Some(format) = mail.ctype.params.get("format")
|
||||||
{
|
{
|
||||||
@@ -2562,4 +2579,67 @@ On 2020-10-25, Bob wrote:
|
|||||||
assert_eq!(msg.get_height(), 64);
|
assert_eq!(msg.get_height(), 64);
|
||||||
assert_eq!(msg.get_filemime().unwrap(), "image/png");
|
assert_eq!(msg.get_filemime().unwrap(), "image/png");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_mime_modified_plain() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_plain_unspecified.eml");
|
||||||
|
let mimeparser = MimeMessage::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(!mimeparser.is_mime_modified);
|
||||||
|
assert_eq!(
|
||||||
|
mimeparser.parts[0].msg,
|
||||||
|
"This message does not have Content-Type nor Subject."
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_mime_modified_alt_plain_html() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_alt_plain_html.eml");
|
||||||
|
let mimeparser = MimeMessage::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(mimeparser.is_mime_modified);
|
||||||
|
assert_eq!(
|
||||||
|
mimeparser.parts[0].msg,
|
||||||
|
"mime-modified test – this is plain"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_mime_modified_alt_plain() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_alt_plain.eml");
|
||||||
|
let mimeparser = MimeMessage::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(!mimeparser.is_mime_modified);
|
||||||
|
assert_eq!(
|
||||||
|
mimeparser.parts[0].msg,
|
||||||
|
"mime-modified test – \
|
||||||
|
mime-modified should not be set set as there is no html and no special stuff;\n\
|
||||||
|
although not being a delta-message.\n\
|
||||||
|
test some special html-characters as < > and & but also \" and ' :)"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_mime_modified_alt_html() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_alt_html.eml");
|
||||||
|
let mimeparser = MimeMessage::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(mimeparser.is_mime_modified);
|
||||||
|
assert_eq!(
|
||||||
|
mimeparser.parts[0].msg,
|
||||||
|
"mime-modified test – mime-modified *set*; simplify is always regarded as lossy."
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_mime_modified_html() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
let raw = include_bytes!("../test-data/message/text_html.eml");
|
||||||
|
let mimeparser = MimeMessage::from_bytes(&t.ctx, raw).await.unwrap();
|
||||||
|
assert!(mimeparser.is_mime_modified);
|
||||||
|
assert_eq!(
|
||||||
|
mimeparser.parts[0].msg,
|
||||||
|
"mime-modified test – mime-modified *set*; simplify is always regarded as lossy."
|
||||||
|
);
|
||||||
|
}
|
||||||
}
|
}
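The rule that decides whether the `mime_modified` flag ends up set for a text part can be read off the hunk above. The following is a minimal sketch of that rule as a pure function; the function name and the flattened boolean parameters are hypothetical and only illustrate the condition, they are not part of the commit.

```rust
/// Hypothetical helper mirroring the condition in the hunk above:
/// the flag stays set once it is set, and is additionally raised when
/// simplifying a non-chat message dropped something (forward header,
/// footer or quote cut, or a removed top quote).
fn needs_original_view(
    already_modified: bool,
    is_forwarded: bool,
    is_cut: bool,
    has_top_quote: bool,
    is_chat_message: bool,
) -> bool {
    already_modified || ((is_forwarded || is_cut || has_top_quote) && !is_chat_message)
}
```

Under this reading, a message carrying a chat version never gets the flag from simplification alone; only an earlier step (HTML part, several alternative parts) can set it.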
|
||||||
|
|||||||
252
src/plaintext.rs
Normal file
@@ -0,0 +1,252 @@
|
|||||||
|
//! Handle plain text together with some attributes.
|
||||||
|
use crate::simplify::split_lines;
|
||||||
|
use once_cell::sync::Lazy;
|
||||||
|
|
||||||
|
#[derive(Debug)]
|
||||||
|
pub struct PlainText {
|
||||||
|
pub text: String,
|
||||||
|
|
||||||
|
/// Text may be "flowed" as defined in [RFC 2646](https://tools.ietf.org/html/rfc2646).
|
||||||
|
/// At a glance, that means, if a line ends with a space, it is merged with the next one
|
||||||
|
/// and the first leading space is ignored
|
||||||
|
/// (to allow lines starting with `>` that normally indicates a quote)
|
||||||
|
pub flowed: bool,
|
||||||
|
|
||||||
|
/// If set together with "flowed",
|
||||||
|
/// the space indicating merging two lines is removed.
|
||||||
|
pub delsp: bool,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl PlainText {
|
||||||
|
/// Convert plain text to HTML.
|
||||||
|
/// The function handles quotes, links, fixed and floating text paragraphs.
|
||||||
|
pub async fn to_html(&self) -> String {
|
||||||
|
static LINKIFY_MAIL_RE: Lazy<regex::Regex> =
|
||||||
|
Lazy::new(|| regex::Regex::new(r#"\b([\w.\-+]+@[\w.\-]+)\b"#).unwrap());
|
||||||
|
|
||||||
|
static LINKIFY_URL_RE: Lazy<regex::Regex> = Lazy::new(|| {
|
||||||
|
regex::Regex::new(r#"\b((http|https|ftp|ftps):[\w.,:;$/@!?&%\-~=#+]+)"#).unwrap()
|
||||||
|
});
|
||||||
|
|
||||||
|
let lines = split_lines(&self.text);
|
||||||
|
|
||||||
|
let mut ret =
|
||||||
|
"<!DOCTYPE html>\n<html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" /></head><body>\n".to_string();
|
||||||
|
|
||||||
|
for line in lines {
|
||||||
|
let is_quote = line.starts_with('>');
|
||||||
|
|
||||||
|
// we need to do html-entity-encoding after linkify, as otherwise encapsulated links
|
||||||
|
// as <http://example.org> cannot be handled correctly
|
||||||
|
// (they would become &lt;http://example.org&gt; where the trailing &gt; would become a valid url part).
|
||||||
|
// to avoid double encoding, we escape our html-entities by \r that must not be used in the string elsewhere.
|
||||||
|
let line = line.to_string().replace("\r", "");
|
||||||
|
|
||||||
|
let mut line = LINKIFY_MAIL_RE
|
||||||
|
.replace_all(&*line, "\rLTa href=\rQUOTmailto:$1\rQUOT\rGT$1\rLT/a\rGT")
|
||||||
|
.as_ref()
|
||||||
|
.to_string();
|
||||||
|
|
||||||
|
line = LINKIFY_URL_RE
|
||||||
|
.replace_all(&*line, "\rLTa href=\rQUOT$1\rQUOT\rGT$1\rLT/a\rGT")
|
||||||
|
.as_ref()
|
||||||
|
.to_string();
|
||||||
|
|
||||||
|
// encode html-entities after linkifying the raw string
|
||||||
|
line = escaper::encode_minimal(&line);
|
||||||
|
|
||||||
|
// make our escaped html-entities real after encoding all others
|
||||||
|
line = line.replace("\rLT", "<");
|
||||||
|
line = line.replace("\rGT", ">");
|
||||||
|
line = line.replace("\rQUOT", "\"");
|
||||||
|
|
||||||
|
if self.flowed {
|
||||||
|
// flowed text as of RFC 3676 -
|
||||||
|
// a leading space shall be removed
|
||||||
|
// and is only there to allow > at the beginning of a line that is no quote.
|
||||||
|
line = line.strip_prefix(" ").unwrap_or(&line).to_string();
|
||||||
|
if is_quote {
|
||||||
|
line = "<em>".to_owned() + &line + "</em>";
|
||||||
|
}
|
||||||
|
|
||||||
|
// a trailing space indicates that the line can be merged with the next one;
|
||||||
|
// for sake of simplicity, we skip merging for quotes (quotes may be combined with
|
||||||
|
// delsp, so `> >` is different from `>>` etc. see RFC 3676 for details)
|
||||||
|
if line.ends_with(' ') && !is_quote {
|
||||||
|
if self.delsp {
|
||||||
|
line.pop();
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
line += "<br/>\n";
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
// normal, fixed text
|
||||||
|
if is_quote {
|
||||||
|
line = "<em>".to_owned() + &line + "</em>";
|
||||||
|
}
|
||||||
|
line += "<br/>\n";
|
||||||
|
}
|
||||||
|
|
||||||
|
ret += &*line;
|
||||||
|
}
|
||||||
|
ret += "</body></html>\n";
|
||||||
|
ret
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(test)]
|
||||||
|
mod tests {
|
||||||
|
use super::*;
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: r##"line 1
|
||||||
|
line 2
|
||||||
|
line with https://link-mid-of-line.org and http://link-end-of-line.com/file?foo=bar%20
|
||||||
|
http://link-at-start-of-line.org
|
||||||
|
"##
|
||||||
|
.to_string(),
|
||||||
|
flowed: false,
|
||||||
|
delsp: false,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r##"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
line 1<br/>
|
||||||
|
line 2<br/>
|
||||||
|
line with <a href="https://link-mid-of-line.org">https://link-mid-of-line.org</a> and <a href="http://link-end-of-line.com/file?foo=bar%20">http://link-end-of-line.com/file?foo=bar%20</a><br/>
|
||||||
|
<a href="http://link-at-start-of-line.org">http://link-at-start-of-line.org</a><br/>
|
||||||
|
<br/>
|
||||||
|
</body></html>
|
||||||
|
"##
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html_encapsulated() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: r#"line with <http://encapsulated.link/?foo=_bar> here!"#.to_string(),
|
||||||
|
flowed: false,
|
||||||
|
delsp: false,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r#"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
line with &lt;<a href="http://encapsulated.link/?foo=_bar">http://encapsulated.link/?foo=_bar</a>&gt; here!<br/>
|
||||||
|
</body></html>
|
||||||
|
"#
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html_nolink() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: r#"line with nohttp://no.link here"#.to_string(),
|
||||||
|
flowed: false,
|
||||||
|
delsp: false,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r#"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
line with nohttp://no.link here<br/>
|
||||||
|
</body></html>
|
||||||
|
"#
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html_mailto() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: r#"just an address: foo@bar.org another@one.de"#.to_string(),
|
||||||
|
flowed: false,
|
||||||
|
delsp: false,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r#"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
just an address: <a href="mailto:foo@bar.org">foo@bar.org</a> <a href="mailto:another@one.de">another@one.de</a><br/>
|
||||||
|
</body></html>
|
||||||
|
"#
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html_flowed() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: "line \nstill line\n>quote \n>still quote\n >no quote".to_string(),
|
||||||
|
flowed: true,
|
||||||
|
delsp: false,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r#"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
line still line<br/>
|
||||||
|
<em>&gt;quote </em><br/>
|
||||||
|
<em>&gt;still quote</em><br/>
|
||||||
|
&gt;no quote<br/>
|
||||||
|
</body></html>
|
||||||
|
"#
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html_flowed_delsp() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: "line \nstill line\n>quote \n>still quote\n >no quote".to_string(),
|
||||||
|
flowed: true,
|
||||||
|
delsp: true,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r#"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
linestill line<br/>
|
||||||
|
<em>&gt;quote </em><br/>
|
||||||
|
<em>&gt;still quote</em><br/>
|
||||||
|
&gt;no quote<br/>
|
||||||
|
</body></html>
|
||||||
|
"#
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_plain_to_html_fixed() {
|
||||||
|
let html = PlainText {
|
||||||
|
text: "line \nstill line\n>quote \n>still quote\n >no quote".to_string(),
|
||||||
|
flowed: false,
|
||||||
|
delsp: false,
|
||||||
|
}
|
||||||
|
.to_html()
|
||||||
|
.await;
|
||||||
|
assert_eq!(
|
||||||
|
html,
|
||||||
|
r#"<!DOCTYPE html>
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>
|
||||||
|
line <br/>
|
||||||
|
still line<br/>
|
||||||
|
<em>&gt;quote </em><br/>
|
||||||
|
<em>&gt;still quote</em><br/>
|
||||||
|
 &gt;no quote<br/>
|
||||||
|
</body></html>
|
||||||
|
"#
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
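The comments in `to_html()` describe a placeholder trick: generated tags are first written with `\r`-prefixed markers, then the whole line is entity-encoded, and only afterwards are the markers turned into real `<`, `>` and `"`. The following is a minimal, std-only sketch of that idea. It is not the commit's code: the scanner replaces the regexes, `encode_minimal` here is a hand-written stand-in for `escaper::encode_minimal`, and all function names are hypothetical.

```rust
/// Stand-in for an HTML entity encoder (assumption: same five characters
/// as the `escaper::encode_minimal` call used in the commit).
fn encode_minimal(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            _ => out.push(c),
        }
    }
    out
}

/// Characters accepted inside a URL by this simplified scanner.
fn is_url_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || ".,:;$/@!?&%-~=#+_".contains(c)
}

/// Find the next `http://` or `https://` that does not directly follow
/// an alphanumeric character (a rough word-boundary check).
fn find_url_start(s: &str) -> Option<usize> {
    let mut search_from = 0;
    while search_from < s.len() {
        let tail = &s[search_from..];
        let pos = ["http://", "https://"]
            .iter()
            .filter_map(|p| tail.find(*p))
            .min()?;
        let abs = search_from + pos;
        let boundary_ok = s[..abs]
            .chars()
            .next_back()
            .map_or(true, |c| !c.is_alphanumeric());
        if boundary_ok {
            return Some(abs);
        }
        // the match starts with ASCII 'h', so abs + 1 is a char boundary
        search_from = abs + 1;
    }
    None
}

/// Linkify one line, then entity-encode it, then make the placeholders real.
fn plain_line_to_html(line: &str) -> String {
    // `\r` is reserved as the placeholder marker, so drop it from the input first
    let line = line.replace('\r', "");
    let mut marked = String::new();
    let mut rest = line.as_str();
    while let Some(start) = find_url_start(rest) {
        let url_len: usize = rest[start..]
            .chars()
            .take_while(|c| is_url_char(*c))
            .map(char::len_utf8)
            .sum();
        let url = &rest[start..start + url_len];
        marked.push_str(&rest[..start]);
        marked.push_str(&format!(
            "\rLTa href=\rQUOT{0}\rQUOT\rGT{0}\rLT/a\rGT",
            url
        ));
        rest = &rest[start + url_len..];
    }
    marked.push_str(rest);
    encode_minimal(&marked)
        .replace("\rLT", "<")
        .replace("\rGT", ">")
        .replace("\rQUOT", "\"")
}
```

Because encoding happens after linkifying, the `>` that closes an encapsulated link such as `<http://example.org>` is still a plain `>` when the URL is scanned, so it is never swallowed into the link.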
|
||||||
@@ -17,13 +17,16 @@ pub fn escape_message_footer_marks(text: &str) -> String {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/// Remove standard (RFC 3676, §4.3) footer if it is found.
|
/// Remove standard (RFC 3676, §4.3) footer if it is found.
|
||||||
|
/// Returns `(lines, is_footer_removed)` tuple;
|
||||||
|
/// `is_footer_removed` is set to `true` if the footer was actually removed from `lines`
|
||||||
|
/// (which is equal to the input array otherwise).
|
||||||
#[allow(clippy::indexing_slicing)]
|
#[allow(clippy::indexing_slicing)]
|
||||||
fn remove_message_footer<'a>(lines: &'a [&str]) -> &'a [&'a str] {
|
fn remove_message_footer<'a>(lines: &'a [&str]) -> (&'a [&'a str], bool) {
|
||||||
let mut nearly_standard_footer = None;
|
let mut nearly_standard_footer = None;
|
||||||
for (ix, &line) in lines.iter().enumerate() {
|
for (ix, &line) in lines.iter().enumerate() {
|
||||||
match line {
|
match line {
|
||||||
// some providers encode `-- ` to `-- =20` which results in `-- `
|
// some providers encode `-- ` to `-- =20` which results in `-- `
|
||||||
"-- " | "-- " => return &lines[..ix],
|
"-- " | "-- " => return (&lines[..ix], true),
|
||||||
// some providers encode `-- ` to `=2D-` which results in only `--`;
|
// some providers encode `-- ` to `=2D-` which results in only `--`;
|
||||||
// use that only when no other footer is found
|
// use that only when no other footer is found
|
||||||
// and if the line before is empty and the line after is not empty
|
// and if the line before is empty and the line after is not empty
|
||||||
@@ -37,13 +40,15 @@ fn remove_message_footer<'a>(lines: &'a [&str]) -> &'a [&'a str] {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
if let Some(ix) = nearly_standard_footer {
|
if let Some(ix) = nearly_standard_footer {
|
||||||
return &lines[..ix];
|
return (&lines[..ix], true);
|
||||||
}
|
}
|
||||||
lines
|
(lines, false)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Remove nonstandard footer and a boolean indicating whether such
|
/// Remove nonstandard footer if it is found.
|
||||||
/// footer was removed.
|
/// Returns `(lines, is_footer_removed)` tuple;
|
||||||
|
/// `is_footer_removed` is set to `true` if the footer was actually removed from `lines`
|
||||||
|
/// (which is equal to the input array otherwise).
|
||||||
#[allow(clippy::indexing_slicing)]
|
#[allow(clippy::indexing_slicing)]
|
||||||
fn remove_nonstandard_footer<'a>(lines: &'a [&str]) -> (&'a [&'a str], bool) {
|
fn remove_nonstandard_footer<'a>(lines: &'a [&str]) -> (&'a [&'a str], bool) {
|
||||||
for (ix, &line) in lines.iter().enumerate() {
|
for (ix, &line) in lines.iter().enumerate() {
|
||||||
@@ -60,20 +65,25 @@ fn remove_nonstandard_footer<'a>(lines: &'a [&str]) -> (&'a [&'a str], bool) {
|
|||||||
(lines, false)
|
(lines, false)
|
||||||
}
|
}
|
||||||
|
|
||||||
fn split_lines(buf: &str) -> Vec<&str> {
|
pub(crate) fn split_lines(buf: &str) -> Vec<&str> {
|
||||||
buf.split('\n').collect()
|
buf.split('\n').collect()
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Simplify message text for chat display.
|
/// Simplify message text for chat display.
|
||||||
/// Remove quotes, signatures, trailing empty lines etc.
|
/// Remove quotes, signatures, trailing empty lines etc.
|
||||||
pub fn simplify(mut input: String, is_chat_message: bool) -> (String, bool, Option<String>) {
|
/// Returns `(text, is_forwarded, is_cut, quote)` tuple,
|
||||||
|
/// returning the simplified text and some additional information gained from the input.
|
||||||
|
pub fn simplify(mut input: String, is_chat_message: bool) -> (String, bool, bool, Option<String>) {
|
||||||
|
let mut is_cut = false;
|
||||||
|
|
||||||
input.retain(|c| c != '\r');
|
input.retain(|c| c != '\r');
|
||||||
let lines = split_lines(&input);
|
let lines = split_lines(&input);
|
||||||
let (lines, is_forwarded) = skip_forward_header(&lines);
|
let (lines, is_forwarded) = skip_forward_header(&lines);
|
||||||
|
|
||||||
let (lines, mut top_quote) = remove_top_quote(lines);
|
let (lines, mut top_quote) = remove_top_quote(lines);
|
||||||
let original_lines = &lines;
|
let original_lines = &lines;
|
||||||
let lines = remove_message_footer(lines);
|
let (lines, footer_removed) = remove_message_footer(lines);
|
||||||
|
is_cut = is_cut || footer_removed;
|
||||||
|
|
||||||
let text = if is_chat_message {
|
let text = if is_chat_message {
|
||||||
render_message(lines, false)
|
render_message(lines, false)
|
||||||
@@ -88,6 +98,7 @@ pub fn simplify(mut input: String, is_chat_message: bool) -> (String, bool, Opti
|
|||||||
if lines.iter().all(|it| it.trim().is_empty()) {
|
if lines.iter().all(|it| it.trim().is_empty()) {
|
||||||
render_message(original_lines, false)
|
render_message(original_lines, false)
|
||||||
} else {
|
} else {
|
||||||
|
is_cut = is_cut || has_nonstandard_footer || bottom_quote.is_some();
|
||||||
render_message(lines, has_nonstandard_footer || bottom_quote.is_some())
|
render_message(lines, has_nonstandard_footer || bottom_quote.is_some())
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
@@ -95,11 +106,13 @@ pub fn simplify(mut input: String, is_chat_message: bool) -> (String, bool, Opti
|
|||||||
if !is_chat_message {
|
if !is_chat_message {
|
||||||
top_quote = top_quote.map(|quote| {
|
top_quote = top_quote.map(|quote| {
|
||||||
let quote_lines = split_lines(&quote);
|
let quote_lines = split_lines(&quote);
|
||||||
let quote_lines = remove_message_footer(&quote_lines);
|
let (quote_lines, footer_removed) = remove_message_footer(&quote_lines);
|
||||||
|
is_cut = is_cut || footer_removed;
|
||||||
|
|
||||||
render_message(quote_lines, false)
|
render_message(quote_lines, false)
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
(text, is_forwarded, top_quote)
|
(text, is_forwarded, is_cut, top_quote)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Skips "forwarded message" header.
|
/// Skips "forwarded message" header.
|
||||||
@@ -254,7 +267,7 @@ mod tests {
|
|||||||
#[test]
|
#[test]
|
||||||
// proptest does not support [[:graphical:][:space:]] regex.
|
// proptest does not support [[:graphical:][:space:]] regex.
|
||||||
fn test_simplify_plain_text_fuzzy(input in "[!-~\t \n]+") {
|
fn test_simplify_plain_text_fuzzy(input in "[!-~\t \n]+") {
|
||||||
let (output, _is_forwarded, _) = simplify(input, true);
|
let (output, _is_forwarded, _, _) = simplify(input, true);
|
||||||
assert!(output.split('\n').all(|s| s != "-- "));
|
assert!(output.split('\n').all(|s| s != "-- "));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -262,38 +275,42 @@ mod tests {
|
|||||||
#[test]
|
#[test]
|
||||||
fn test_dont_remove_whole_message() {
|
fn test_dont_remove_whole_message() {
|
||||||
let input = "\n------\nFailed\n------\n\nUh-oh, this workflow did not succeed!\n\nlots of other text".to_string();
|
let input = "\n------\nFailed\n------\n\nUh-oh, this workflow did not succeed!\n\nlots of other text".to_string();
|
||||||
let (plain, is_forwarded, _) = simplify(input, false);
|
let (plain, is_forwarded, is_cut, _) = simplify(input, false);
|
||||||
assert_eq!(
|
assert_eq!(
|
||||||
plain,
|
plain,
|
||||||
"------\nFailed\n------\n\nUh-oh, this workflow did not succeed!\n\nlots of other text"
|
"------\nFailed\n------\n\nUh-oh, this workflow did not succeed!\n\nlots of other text"
|
||||||
);
|
);
|
||||||
assert!(!is_forwarded);
|
assert!(!is_forwarded);
|
||||||
|
assert!(!is_cut);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn test_chat_message() {
|
fn test_chat_message() {
|
||||||
let input = "Hi! How are you?\n\n---\n\nI am good.\n-- \nSent with my Delta Chat Messenger: https://delta.chat".to_string();
|
let input = "Hi! How are you?\n\n---\n\nI am good.\n-- \nSent with my Delta Chat Messenger: https://delta.chat".to_string();
|
||||||
let (plain, is_forwarded, _) = simplify(input, true);
|
let (plain, is_forwarded, is_cut, _) = simplify(input, true);
|
||||||
assert_eq!(plain, "Hi! How are you?\n\n---\n\nI am good.");
|
assert_eq!(plain, "Hi! How are you?\n\n---\n\nI am good.");
|
||||||
assert!(!is_forwarded);
|
assert!(!is_forwarded);
|
||||||
|
assert!(is_cut);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn test_simplify_trim() {
|
fn test_simplify_trim() {
|
||||||
let input = "line1\n\r\r\rline2".to_string();
|
let input = "line1\n\r\r\rline2".to_string();
|
||||||
let (plain, is_forwarded, _) = simplify(input, false);
|
let (plain, is_forwarded, is_cut, _) = simplify(input, false);
|
||||||
|
|
||||||
assert_eq!(plain, "line1\nline2");
|
assert_eq!(plain, "line1\nline2");
|
||||||
assert!(!is_forwarded);
|
assert!(!is_forwarded);
|
||||||
|
assert!(!is_cut);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn test_simplify_forwarded_message() {
|
fn test_simplify_forwarded_message() {
|
||||||
let input = "---------- Forwarded message ----------\r\nFrom: test@example.com\r\n\r\nForwarded message\r\n-- \r\nSignature goes here".to_string();
|
let input = "---------- Forwarded message ----------\r\nFrom: test@example.com\r\n\r\nForwarded message\r\n-- \r\nSignature goes here".to_string();
|
||||||
let (plain, is_forwarded, _) = simplify(input, false);
|
let (plain, is_forwarded, is_cut, _) = simplify(input, false);
|
||||||
|
|
||||||
assert_eq!(plain, "Forwarded message");
|
assert_eq!(plain, "Forwarded message");
|
||||||
assert!(is_forwarded);
|
assert!(is_forwarded);
|
||||||
|
assert!(is_cut);
|
||||||
}
|
}
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
@@ -335,41 +352,50 @@ mod tests {
|
|||||||
#[test]
|
#[test]
|
||||||
fn test_remove_message_footer() {
|
fn test_remove_message_footer() {
|
||||||
let input = "text\n--\nno footer".to_string();
|
let input = "text\n--\nno footer".to_string();
|
||||||
let (plain, _, _) = simplify(input, true);
|
let (plain, _, is_cut, _) = simplify(input, true);
|
||||||
assert_eq!(plain, "text\n--\nno footer");
|
assert_eq!(plain, "text\n--\nno footer");
|
||||||
|
assert!(!is_cut);
|
||||||
|
|
||||||
let input = "text\n\n--\n\nno footer".to_string();
|
let input = "text\n\n--\n\nno footer".to_string();
|
||||||
let (plain, _, _) = simplify(input, true);
|
let (plain, _, is_cut, _) = simplify(input, true);
|
||||||
assert_eq!(plain, "text\n\n--\n\nno footer");
|
assert_eq!(plain, "text\n\n--\n\nno footer");
|
||||||
|
assert!(!is_cut);
|
||||||
|
|
||||||
let input = "text\n\n-- no footer\n\n".to_string();
|
let input = "text\n\n-- no footer\n\n".to_string();
|
||||||
let (plain, _, _) = simplify(input, true);
|
let (plain, _, _, _) = simplify(input, true);
|
||||||
assert_eq!(plain, "text\n\n-- no footer");
|
assert_eq!(plain, "text\n\n-- no footer");
|
||||||
|
|
||||||
let input = "text\n\n--\nno footer\n-- \nfooter".to_string();
|
let input = "text\n\n--\nno footer\n-- \nfooter".to_string();
|
||||||
let (plain, _, _) = simplify(input, true);
|
let (plain, _, is_cut, _) = simplify(input, true);
|
||||||
assert_eq!(plain, "text\n\n--\nno footer");
|
assert_eq!(plain, "text\n\n--\nno footer");
|
||||||
|
assert!(is_cut);
|
||||||
|
|
||||||
let input = "text\n\n--\ntreated as footer when unescaped".to_string();
|
let input = "text\n\n--\ntreated as footer when unescaped".to_string();
|
||||||
let (plain, _, _) = simplify(input.clone(), true);
|
let (plain, _, is_cut, _) = simplify(input.clone(), true);
|
||||||
assert_eq!(plain, "text"); // see remove_message_footer() for some explanations
|
assert_eq!(plain, "text"); // see remove_message_footer() for some explanations
|
||||||
|
assert!(is_cut);
|
||||||
let escaped = escape_message_footer_marks(&input);
|
let escaped = escape_message_footer_marks(&input);
|
||||||
let (plain, _, _) = simplify(escaped, true);
|
let (plain, _, is_cut, _) = simplify(escaped, true);
|
||||||
assert_eq!(plain, "text\n\n--\ntreated as footer when unescaped");
|
assert_eq!(plain, "text\n\n--\ntreated as footer when unescaped");
|
||||||
|
assert!(!is_cut);
|
||||||
|
|
||||||
// Nonstandard footer sent by https://siju.es/
|
// Nonstandard footer sent by https://siju.es/
|
||||||
let input = "Message text here\n---Desde mi teléfono con SIJÚ\n\nQuote here".to_string();
|
let input = "Message text here\n---Desde mi teléfono con SIJÚ\n\nQuote here".to_string();
|
||||||
let (plain, _, _) = simplify(input.clone(), false);
|
let (plain, _, is_cut, _) = simplify(input.clone(), false);
|
||||||
assert_eq!(plain, "Message text here [...]");
|
assert_eq!(plain, "Message text here [...]");
|
||||||
let (plain, _, _) = simplify(input.clone(), true);
|
assert!(is_cut);
|
||||||
|
let (plain, _, is_cut, _) = simplify(input.clone(), true);
|
||||||
assert_eq!(plain, input);
|
assert_eq!(plain, input);
|
||||||
|
assert!(!is_cut);
|
||||||
|
|
||||||
let input = "--\ntreated as footer when unescaped".to_string();
|
let input = "--\ntreated as footer when unescaped".to_string();
|
||||||
let (plain, _, _) = simplify(input.clone(), true);
|
let (plain, _, is_cut, _) = simplify(input.clone(), true);
|
||||||
assert_eq!(plain, ""); // see remove_message_footer() for some explanations
|
assert_eq!(plain, ""); // see remove_message_footer() for some explanations
|
||||||
|
assert!(is_cut);
|
||||||
|
|
||||||
let escaped = escape_message_footer_marks(&input);
|
let escaped = escape_message_footer_marks(&input);
|
||||||
let (plain, _, _) = simplify(escaped, true);
|
let (plain, _, is_cut, _) = simplify(escaped, true);
|
||||||
assert_eq!(plain, "--\ntreated as footer when unescaped");
|
assert_eq!(plain, "--\ntreated as footer when unescaped");
|
||||||
|
assert!(!is_cut);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
52
src/sql.rs
@@ -224,6 +224,31 @@ impl Sql {
|
|||||||
.await
|
.await
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Check if a column exists in a given table.
|
||||||
|
pub async fn col_exists(
|
||||||
|
&self,
|
||||||
|
table_name: impl AsRef<str>,
|
||||||
|
col_name: impl AsRef<str>,
|
||||||
|
) -> Result<bool> {
|
||||||
|
let table_name = table_name.as_ref().to_string();
|
||||||
|
let col_name = col_name.as_ref().to_string();
|
||||||
|
self.with_conn(move |conn| {
|
||||||
|
let mut exists = false;
|
||||||
|
// `PRAGMA table_info` returns one row per column,
|
||||||
|
// each row containing 0=cid, 1=name, 2=type, 3=notnull, 4=dflt_value
|
||||||
|
conn.pragma(None, "table_info", &table_name, |row| {
|
||||||
|
let curr_name: String = row.get(1)?;
|
||||||
|
if col_name == curr_name {
|
||||||
|
exists = true;
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
})?;
|
||||||
|
|
||||||
|
Ok(exists)
|
||||||
|
})
|
||||||
|
.await
|
||||||
|
}
|
||||||
|
|
||||||
/// Execute a query which is expected to return zero or one row.
|
/// Execute a query which is expected to return zero or one row.
|
||||||
pub async fn query_row_optional<T, F>(
|
pub async fn query_row_optional<T, F>(
|
||||||
&self,
|
&self,
|
||||||
@@ -1413,6 +1438,17 @@ CREATE INDEX devmsglabels_index1 ON devmsglabels (label);
|
|||||||
|
|
||||||
sql.set_raw_config_int(context, "dbversion", 71).await?;
|
sql.set_raw_config_int(context, "dbversion", 71).await?;
|
||||||
}
|
}
|
||||||
|
if dbversion < 72 {
|
||||||
|
info!(context, "[migration] v72");
|
||||||
|
if !sql.col_exists("msgs", "mime_modified").await? {
|
||||||
|
sql.execute(
|
||||||
|
"ALTER TABLE msgs ADD COLUMN mime_modified INTEGER DEFAULT 0;",
|
||||||
|
paramsv![],
|
||||||
|
)
|
||||||
|
.await?;
|
||||||
|
}
|
||||||
|
sql.set_raw_config_int(context, "dbversion", 72).await?;
|
||||||
|
}
|
||||||
|
|
||||||
// (2) updates that require high-level objects
|
// (2) updates that require high-level objects
|
||||||
// (the structure is complete now and all objects are usable)
|
// (the structure is complete now and all objects are usable)
|
||||||
@@ -1468,6 +1504,7 @@ async fn prune_tombstones(context: &Context) -> Result<()> {
|
|||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
mod test {
|
mod test {
|
||||||
use super::*;
|
use super::*;
|
||||||
|
use crate::test_utils::TestContext;
|
||||||
|
|
||||||
#[test]
|
#[test]
|
||||||
fn test_maybe_add_file() {
|
fn test_maybe_add_file() {
|
||||||
@@ -1494,4 +1531,19 @@ mod test {
|
|||||||
assert!(!is_file_in_use(&files, Some(".txt"), "hello"));
|
assert!(!is_file_in_use(&files, Some(".txt"), "hello"));
|
||||||
assert!(is_file_in_use(&files, Some("-suffix"), "world.txt-suffix"));
|
assert!(is_file_in_use(&files, Some("-suffix"), "world.txt-suffix"));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_table_exists() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
assert!(t.ctx.sql.table_exists("msgs").await.unwrap());
|
||||||
|
assert!(!t.ctx.sql.table_exists("foobar").await.unwrap());
|
||||||
|
}
|
||||||
|
|
||||||
|
#[async_std::test]
|
||||||
|
async fn test_col_exists() {
|
||||||
|
let t = TestContext::new().await;
|
||||||
|
assert!(t.ctx.sql.col_exists("msgs", "mime_modified").await.unwrap());
|
||||||
|
assert!(!t.ctx.sql.col_exists("msgs", "foobar").await.unwrap());
|
||||||
|
assert!(!t.ctx.sql.col_exists("foobar", "foobar").await.unwrap());
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
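The v72 migration above only adds `mime_modified` when `col_exists()` reports it missing, because some shipped beta versions may already have created the column under another dbversion. The guard can be sketched without a database as follows. This is a minimal in-memory model, not the rusqlite API: `ColumnInfo`, `TableSchema`, `add_column` and `migrate_v72` are hypothetical names introduced only for illustration.

```rust
/// One row of `PRAGMA table_info(<table>)`, reduced to the fields used here.
/// (Hypothetical type; the real code reads the name from column index 1.)
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnInfo {
    pub cid: usize,
    pub name: String,
    pub decl_type: String,
}

/// Hypothetical in-memory stand-in for the schema of a single table.
#[derive(Debug, Default)]
pub struct TableSchema {
    pub columns: Vec<ColumnInfo>,
}

impl TableSchema {
    /// Same idea as `col_exists()` in the diff: scan all rows, compare names.
    pub fn col_exists(&self, col_name: &str) -> bool {
        self.columns.iter().any(|c| c.name == col_name)
    }

    /// Models `ALTER TABLE ... ADD COLUMN`, which fails on a duplicate name.
    pub fn add_column(&mut self, name: &str, decl_type: &str) -> Result<(), String> {
        if self.col_exists(name) {
            return Err(format!("duplicate column name: {}", name));
        }
        let cid = self.columns.len();
        self.columns.push(ColumnInfo {
            cid,
            name: name.to_string(),
            decl_type: decl_type.to_string(),
        });
        Ok(())
    }
}

/// Idempotent migration step: returns Ok(true) if the column was added,
/// Ok(false) if it was already there (e.g. created by a beta version).
pub fn migrate_v72(schema: &mut TableSchema) -> Result<bool, String> {
    if schema.col_exists("mime_modified") {
        return Ok(false);
    }
    schema.add_column("mime_modified", "INTEGER")?;
    Ok(true)
}
```

Running `migrate_v72` twice on the same schema succeeds both times, whereas calling `add_column` unguarded a second time returns an error. Note that the comparison here is exact, as in the diff, while SQLite itself treats column names case-insensitively.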
53
test-data/message/apple_cid_jpg.eml
Normal file
@@ -0,0 +1,53 @@
|
|||||||
|
From: =?utf-8?Q?Bj=C3=B6rn_Petersen?= <somewhere-apple@me.com>
|
||||||
|
Content-Type: multipart/alternative;
|
||||||
|
boundary="Apple-Mail=_19251BCB-E12B-423A-9553-5A68560C2AFD"
|
||||||
|
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\))
|
||||||
|
Subject: a jpeg
|
||||||
|
Message-Id: <BC47DA72-6C78-443A-8EBF-2CD199ABAD09@me.com>
|
||||||
|
Date: Sat, 9 Jan 2021 00:36:11 +0100
|
||||||
|
To: somewhere-nonapple@testrun.org
|
||||||
|
X-Mailer: Apple Mail (2.3608.120.23.2.4)
|
||||||
|
|
||||||
|
|
||||||
|
--Apple-Mail=_19251BCB-E12B-423A-9553-5A68560C2AFD
|
||||||
|
Content-Transfer-Encoding: 7bit
|
||||||
|
Content-Type: text/plain;
|
||||||
|
charset=us-ascii
|
||||||
|
|
||||||
|
|
||||||
|
a jpeg
|
||||||
|
--Apple-Mail=_19251BCB-E12B-423A-9553-5A68560C2AFD
|
||||||
|
Content-Type: multipart/related;
|
||||||
|
type="text/html";
|
||||||
|
boundary="Apple-Mail=_4C3710FD-D75D-47FB-8D41-983220390856"
|
||||||
|
|
||||||
|
|
||||||
|
--Apple-Mail=_4C3710FD-D75D-47FB-8D41-983220390856
|
||||||
|
Content-Transfer-Encoding: 7bit
|
||||||
|
Content-Type: text/html;
|
||||||
|
charset=us-ascii
|
||||||
|
|
||||||
|
<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"><base></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><base class=""><div class="Apple-Mail-URLShareUserContentTopClass"><br class=""></div><div class="Apple-Mail-URLShareWrapperClass"><blockquote type="cite" style="border-left-style: none; color: inherit; padding: inherit; margin: inherit;" class="">a jpeg
|
||||||
|
<img apple-inline="yes" id="118F6150-5EF5-4DE8-917F-1851EC94FB7C" src="cid:8AE052EF-BC90-486F-BB78-58D3590308EC@fritz.box" class=""></blockquote></div></body></html>
|
||||||
|
--Apple-Mail=_4C3710FD-D75D-47FB-8D41-983220390856
|
||||||
|
Content-Transfer-Encoding: base64
|
||||||
|
Content-Disposition: inline;
|
||||||
|
filename=small.jpg
|
||||||
|
Content-Type: image/jpeg;
|
||||||
|
x-unix-mode=0666;
|
||||||
|
name="small.jpg"
|
||||||
|
Content-Id: <8AE052EF-BC90-486F-BB78-58D3590308EC@fritz.box>
|
||||||
|
|
||||||
|
/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDABELDA8MChEPDg8TEhEUGSobGRcXGTMkJh4qPDU/Pjs1
|
||||||
|
OjlDS2BRQ0daSDk6U3FUWmNma2xrQFB2fnRofWBpa2f/2wBDARITExkWGTEbGzFnRTpFZ2dnZ2dn
|
||||||
|
Z2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2f/wgARCAAIAAgDAREA
|
||||||
|
AhEBAxEB/8QAFAABAAAAAAAAAAAAAAAAAAAABP/EABUBAQEAAAAAAAAAAAAAAAAAAAMF/9oADAMB
|
||||||
|
AAIQAxAAAAF8s//EABQQAQAAAAAAAAAAAAAAAAAAAAD/2gAIAQEAAQUCf//EABQRAQAAAAAAAAAA
|
||||||
|
AAAAAAAAAAD/2gAIAQMBAT8Bf//EABQRAQAAAAAAAAAAAAAAAAAAAAD/2gAIAQIBAT8Bf//EABQQ
|
||||||
|
AQAAAAAAAAAAAAAAAAAAAAD/2gAIAQEABj8Cf//EABQQAQAAAAAAAAAAAAAAAAAAAAD/2gAIAQEA
|
||||||
|
AT8hf//aAAwDAQACAAMAAAAQ3//EABQRAQAAAAAAAAAAAAAAAAAAAAD/2gAIAQMBAT8Qf//EABQR
|
||||||
|
AQAAAAAAAAAAAAAAAAAAAAD/2gAIAQIBAT8Qf//EABQQAQAAAAAAAAAAAAAAAAAAAAD/2gAIAQEA
|
||||||
|
AT8Qf//Z
|
||||||
|
--Apple-Mail=_4C3710FD-D75D-47FB-8D41-983220390856--
|
||||||
|
|
||||||
|
--Apple-Mail=_19251BCB-E12B-423A-9553-5A68560C2AFD--
|
||||||
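The fixture above references its inline JPEG through `src="cid:..."`, and the commit message mentions converting `cid:` to the `data:` protocol. A rough sketch of that rewrite is shown below. It uses a plain string replacement instead of the regex the commit message refers to, so no escaping of the content id is needed. `inline_cid` and its parameters are hypothetical names, and percent-encoded `cid:` URLs are not handled.

```rust
/// Replaces `"cid:<id>"` attribute values in `html` with a `data:` URL.
///
/// `content_id` is the raw `Content-Id` header value, `base64_body` the
/// still-encoded (and possibly line-wrapped) body of the referenced part.
/// Hypothetical helper for illustration only.
pub fn inline_cid(html: &str, content_id: &str, mimetype: &str, base64_body: &str) -> String {
    // Content-Id headers wrap the id in angle brackets; cid: URLs do not.
    let id = content_id
        .trim()
        .trim_start_matches('<')
        .trim_end_matches('>');

    // Base64 in mails is wrapped into short lines; data: URLs need it unbroken.
    let payload: String = base64_body.chars().filter(|c| !c.is_whitespace()).collect();

    // Match the quoted attribute value so that ids sharing a prefix do not collide.
    let needle = format!("\"cid:{}\"", id);
    let replacement = format!("\"data:{};base64,{}\"", mimetype, payload);
    html.replace(&needle, &replacement)
}
```

For the fixture, the call would pass the `Content-Id` header value, `image/jpeg`, and the base64 lines of the `small.jpg` part. HTML that does not reference the id is returned unchanged.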
16
test-data/message/text_alt_html.eml
Normal file
@@ -0,0 +1,16 @@
|
|||||||
|
Subject: mime-modified test
|
||||||
|
Message-ID: 12345@testrun.org
|
||||||
|
Date: Sat, 07 Dec 2019 19:00:27 +0000
|
||||||
|
To: recp@testrun.org
|
||||||
|
From: sender@testrun.org
|
||||||
|
Content-Type: multipart/alternative; boundary="==BREAK=="
|
||||||
|
|
||||||
|
|
||||||
|
--==BREAK==
|
||||||
|
Content-Type: text/html; charset=utf-8
|
||||||
|
|
||||||
|
<html>
|
||||||
|
<p>mime-modified <b>set</b>; simplify is always regarded as lossy.</p>
|
||||||
|
</html>
|
||||||
|
|
||||||
|
--==BREAK==--
|
||||||
16
test-data/message/text_alt_plain.eml
Normal file
@@ -0,0 +1,16 @@
|
|||||||
|
Subject: mime-modified test
|
||||||
|
Message-ID: 12345@testrun.org
|
||||||
|
Date: Sat, 07 Dec 2019 19:00:27 +0000
|
||||||
|
To: recp@testrun.org
|
||||||
|
From: sender@testrun.org
|
||||||
|
Content-Type: multipart/alternative; boundary="==BREAK=="
|
||||||
|
|
||||||
|
|
||||||
|
--==BREAK==
|
||||||
|
Content-Type: text/plain; charset=utf-8
|
||||||
|
|
||||||
|
mime-modified should not be set set as there is no html and no special stuff;
|
||||||
|
although not being a delta-message.
|
||||||
|
test some special html-characters as < > and & but also " and ' :)
|
||||||
|
|
||||||
|
--==BREAK==--
|
||||||
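The plain-text fixture above deliberately contains `<`, `>`, `&`, `"` and `'`, since the commit converts text/plain to HTML when the original message is requested. The escaping step can be sketched as below. This is an illustration under assumptions, not the function from the commit: `escape_plain_to_html` is a hypothetical name, and linkification, quotes and format=flowed are left out.

```rust
/// Escapes HTML-special characters and turns line ends into `<br/>`.
/// Hypothetical helper; only the escaping part of a plain-to-HTML conversion.
pub fn escape_plain_to_html(plain: &str) -> String {
    let mut out = String::with_capacity(plain.len());
    for line in plain.lines() {
        for ch in line.chars() {
            // Mapping char by char avoids double-escaping the `&` of an entity.
            match ch {
                '&' => out.push_str("&amp;"),
                '<' => out.push_str("&lt;"),
                '>' => out.push_str("&gt;"),
                '"' => out.push_str("&quot;"),
                '\'' => out.push_str("&#39;"),
                _ => out.push(ch),
            }
        }
        // `lines()` already strips `\n` and `\r\n`, so Windows line ends work too.
        out.push_str("<br/>\n");
    }
    out
}
```

Escaping per character rather than with chained `replace` calls means the order of the rules does not matter.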
23
test-data/message/text_alt_plain_html.eml
Normal file
@@ -0,0 +1,23 @@
|
|||||||
|
Subject: mime-modified test
|
||||||
|
Message-ID: 12345@testrun.org
|
||||||
|
Date: Sat, 07 Dec 2019 19:00:27 +0000
|
||||||
|
To: recp@testrun.org
|
||||||
|
From: sender@testrun.org
|
||||||
|
Content-Type: multipart/alternative; boundary="==BREAK=="
|
||||||
|
|
||||||
|
|
||||||
|
--==BREAK==
|
||||||
|
Content-Type: text/plain; charset=utf-8
|
||||||
|
|
||||||
|
this is plain
|
||||||
|
|
||||||
|
--==BREAK==
|
||||||
|
Content-Type: text/html; charset=utf-8
|
||||||
|
|
||||||
|
<html>
|
||||||
|
<p>
|
||||||
|
this is <b>html</b>
|
||||||
|
</p>
|
||||||
|
</html>
|
||||||
|
|
||||||
|
--==BREAK==--
|
||||||
11
test-data/message/text_html.eml
Normal file
@@ -0,0 +1,11 @@
|
|||||||
|
Subject: mime-modified test
|
||||||
|
Message-ID: 12345@testrun.org
|
||||||
|
Date: Sat, 07 Dec 2019 19:00:27 +0000
|
||||||
|
To: recp@testrun.org
|
||||||
|
From: sender@testrun.org
|
||||||
|
Content-Type: text/html; charset=utf-8
|
||||||
|
|
||||||
|
|
||||||
|
<html>
|
||||||
|
<p>mime-modified <b>set</b>; simplify is always regarded as lossy.</p>
|
||||||
|
</html>
|
||||||
11
test-data/message/text_plain_flowed.eml
Normal file
@@ -0,0 +1,11 @@
|
|||||||
|
Message-Id: <lkjsdf01u@example.org>
|
||||||
|
Date: Sat, 14 Sep 2019 19:00:13 +0200
|
||||||
|
From: lmn <x@tux.org>
|
||||||
|
To: abc <abc@bcd.com>
|
||||||
|
Content-Type: text/plain; charset=utf-8; format=flowed
|
||||||
|
|
||||||
|
This line ends with a space
|
||||||
|
and will be merged with the next one due to format=flowed.
|
||||||
|
|
||||||
|
This line does not end with a space
|
||||||
|
and will be wrapped as usual.
|
||||||
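The fixture above relies on the `format=flowed` rule that a line ending with a space is a soft break and continues on the next line. A simplified sketch of that unwrapping, loosely following RFC 3676 with `DelSp=no`, could look like this. `unflow` is a hypothetical name, and quoted lines (`>`) are not handled.

```rust
/// Joins soft-wrapped lines of a `format=flowed` text body.
///
/// Simplified: handles soft breaks, space-stuffing and the `-- ` signature
/// separator, but not quote depth. Hypothetical helper for illustration.
pub fn unflow(body: &str) -> String {
    let mut out = String::new();
    for raw in body.lines() {
        // Space-stuffing: one leading space was added by the sender, drop it.
        let line = raw.strip_prefix(' ').unwrap_or(raw);

        // The signature separator never flows and always starts on its own line.
        let is_sig_sep = line == "-- ";
        if is_sig_sep && !out.is_empty() && !out.ends_with('\n') {
            out.push('\n');
        }

        out.push_str(line);

        // A trailing space marks a soft break: keep the space, skip the newline.
        let soft_break = line.ends_with(' ') && !is_sig_sep;
        if !soft_break {
            out.push('\n');
        }
    }
    out
}
```

With this rule, the first two lines of the fixture end up as one paragraph, while the last two stay on separate lines because neither ends with a space.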
7
test-data/message/text_plain_iso88591.eml
Normal file
@@ -0,0 +1,7 @@
|
|||||||
|
Message-Id: <lkjsdf01u@example.org>
|
||||||
|
Date: Sat, 14 Sep 2019 19:00:13 +0200
|
||||||
|
From: lmn <x@tux.org>
|
||||||
|
To: abc <abc@bcd.com>
|
||||||
|
Content-Type: text/plain; charset=iso-8859-1
|
||||||
|
|
||||||
|
message with a non-UTF-8 encoding: <20><><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>
|
||||||
6
test-data/message/text_plain_unspecified.eml
Normal file
@@ -0,0 +1,6 @@
|
|||||||
|
Message-Id: <lkjsdf01u@example.org>
|
||||||
|
Date: Sat, 14 Sep 2019 19:00:13 +0200
|
||||||
|
From: lmn <x@tux.org>
|
||||||
|
To: abc <abc@bcd.com>
|
||||||
|
|
||||||
|
This message does not have Content-Type nor Subject.
|
||||||