perf(dashboard): avoid full note reads when building excerpts by joshtrichards · Pull Request #1886 · nextcloud/notes

joshtrichards · 2026-06-04T14:47:43Z

Summary

Optimize note excerpt generation by reading only a small prefix of the file instead of loading the full note content.

Changes

add a lightweight, best-effort excerpt source reader by switching from the File API's getContent() to our own fopen()/fread() calls
read only a bounded prefix sized for excerpt generation
trim incomplete trailing UTF-8 bytes with mb_strcut()
preserve existing excerpt formatting behavior after markdown stripping
return an empty excerpt if the lightweight preview read fails, instead of falling back to a full file read
additional optimization: use a pre-compiled string literal for BOM stripping, eliminating the need for pack() at runtime
also fixes a small bug: $title is typed as string, and empty() returns true for values like "0", so such titles were wrongly treated as empty
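
The bounded read described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `readExcerptPrefix()` is a hypothetical helper, and the stream handle is assumed to have been opened elsewhere (in the app it would presumably come from the note's file node).

```php
<?php
/**
 * Read at most $maxBytes from an already-open stream, best effort.
 * Returns '' on failure instead of falling back to a full read.
 *
 * Hypothetical helper for illustration only.
 *
 * @param resource|false $handle
 */
function readExcerptPrefix($handle, int $maxBytes): string {
	if (!is_resource($handle) || $maxBytes <= 0) {
		return '';
	}
	$buffer = '';
	// fread() may return fewer bytes than requested on non-local streams,
	// so keep reading until the budget is filled or the stream ends.
	while (strlen($buffer) < $maxBytes && !feof($handle)) {
		$chunk = fread($handle, $maxBytes - strlen($buffer));
		if ($chunk === false || $chunk === '') {
			break;
		}
		$buffer .= $chunk;
	}
	return $buffer;
}

// Usage with an in-memory stream standing in for a large note file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, str_repeat('x', 10000));
rewind($handle);
$prefix = readExcerptPrefix($handle, 512);
fclose($handle);
echo strlen($prefix), "\n"; // 512
```

Only the first 512 bytes are pulled into memory here, regardless of how large the underlying stream is.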

Why

Excerpt generation only needs roughly the first 100 visible characters, so reading the entire note is unnecessary work. This change reduces I/O and memory usage for note list rendering, especially for larger notes or slower storage backends.

Notes

this is intentionally a performance-oriented heuristic, not a full-fidelity content read
markdown-heavy prefixes may occasionally produce shorter previews than before
full note reads via getContent() remain unchanged

joshtrichards · 2026-06-19T21:58:51Z

Lint php-cs failures are unrelated - see #1905

enjeck

some comments:

enjeck · 2026-06-22T06:28:46Z

+		$excerpt = $this->noteUtil->stripMarkdown($this->getExcerptContent($maxlen));
+


Since this no longer goes through getContent(), it loses the non-UTF-8 handling done there. A UTF-16-encoded note that produced a readable excerpt before will now be read as raw bytes and treated as UTF-8 below, yielding garbage?

enjeck · 2026-06-22T06:33:38Z

+		// Over-read bytes assuming worst-case UTF-8 size (up to 4 bytes per
+		// character). This is only a heuristic for preview generation; markdown
+		// stripping may reduce the visible character count further.
+		$bytesToRead = max(512, $maxlen * 4);


Maybe * 6 is better. With the default maxlen=100 this reads 512 bytes. After stripMarkdown() and the leading-title strip (lines 71–76), a long first line, URL, or title can push the visible excerpt below maxlen, whereas the old full read produced a complete one.
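
To make the arithmetic concrete, here is a small sketch that mirrors the expression from the diff with the multiplier pulled out as a parameter. `excerptByteBudget()` and its parameter names are hypothetical, introduced only for this comparison.

```php
<?php
/**
 * Byte budget for the bounded excerpt read.
 * Hypothetical helper mirroring max(512, $maxlen * 4) from the diff.
 */
function excerptByteBudget(int $maxlen, int $bytesPerChar = 4, int $floor = 512): int {
	return max($floor, $maxlen * $bytesPerChar);
}

echo excerptByteBudget(100), "\n";    // 512: the floor wins, since 100 * 4 = 400
echo excerptByteBudget(100, 6), "\n"; // 600
```

With the default maxlen the multiplier of 4 never takes effect because the 512-byte floor is larger; a multiplier of 6 raises the read to 600 bytes.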

enjeck · 2026-06-22T06:37:50Z

+		// Remove any partial trailing multibyte character from the truncated read.
+		$content = mb_strcut($content, 0, strlen($content), 'UTF-8');


I don't understand. The comment says this removes a partial trailing multibyte character, but passing strlen($content) (the full byte length) as the cut length makes it effectively a no-op for that purpose?
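
For illustration, an explicit trim that does not depend on how mb_strcut() treats a full-length cut could look like the sketch below. `trimPartialUtf8Tail()` is a hypothetical helper, not part of this PR; it only handles a sequence cut off at the very end of the buffer and leaves other invalid input untouched.

```php
<?php
/**
 * Drop an incomplete UTF-8 sequence at the end of a truncated byte string.
 * Hypothetical helper for illustration only.
 */
function trimPartialUtf8Tail(string $bytes): string {
	$len = strlen($bytes);
	$i = $len - 1;
	$seen = 0;
	// Walk back over at most 3 continuation bytes (10xxxxxx) to find the lead byte.
	while ($i >= 0 && $seen < 3 && (ord($bytes[$i]) & 0xC0) === 0x80) {
		$i--;
		$seen++;
	}
	if ($i < 0) {
		return $bytes; // empty, or continuation bytes only: nothing to decide
	}
	$lead = ord($bytes[$i]);
	if ($lead < 0x80) {
		$expected = 1;
	} elseif (($lead & 0xE0) === 0xC0) {
		$expected = 2;
	} elseif (($lead & 0xF0) === 0xE0) {
		$expected = 3;
	} elseif (($lead & 0xF8) === 0xF0) {
		$expected = 4;
	} else {
		return $bytes; // not a valid lead byte; leave the input untouched
	}
	// Cut only when the final sequence has fewer bytes than its lead byte announces.
	return ($len - $i) < $expected ? substr($bytes, 0, $i) : $bytes;
}

var_dump(trimPartialUtf8Tail("caf\xC3") === 'caf'); // bool(true): the lone lead byte of "é" is dropped
```

A complete string such as "café" passes through unchanged, because the final sequence already has the length its lead byte announces.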

enjeck · 2026-06-22T06:40:03Z

+		// Strip Byte Order Marks (BOM) for UTF-8, UTF-16 BE, and UTF-16 LE
+		$content = str_replace(["\xEF\xBB\xBF", "\xFE\xFF", "\xFF\xFE"], '', $content);


The UTF-16 BOMs (\xFE\xFF, \xFF\xFE) are stripped here, but since the body isn't transcoded from UTF-16 (as I said at https://github.com/nextcloud/notes/pull/1886/changes#r3450190129), the rest of a UTF-16 note is still mis-decoded?
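
One possible way to address this, sketched under assumptions: inspect only the leading bytes for a BOM and transcode UTF-16 input to UTF-8 before any further processing. `normalizeExcerptBytes()` is a hypothetical helper, not the handling done in getContent(); it requires the mbstring extension and does not cover UTF-16 content without a BOM.

```php
<?php
/**
 * Strip a leading BOM and, for UTF-16 input, transcode the body to UTF-8.
 * Hypothetical helper; only a BOM at the very start of the buffer is considered.
 */
function normalizeExcerptBytes(string $bytes): string {
	if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
		return substr($bytes, 3); // UTF-8 BOM: drop it, body is already UTF-8
	}
	foreach (["\xFE\xFF" => 'UTF-16BE', "\xFF\xFE" => 'UTF-16LE'] as $bom => $encoding) {
		if (strncmp($bytes, $bom, 2) === 0) {
			$body = substr($bytes, 2);
			// A bounded read can end in the middle of a 2-byte code unit; drop the odd byte.
			// (A split surrogate pair at the very end is not handled in this sketch.)
			if (strlen($body) % 2 === 1) {
				$body = substr($body, 0, -1);
			}
			return mb_convert_encoding($body, 'UTF-8', $encoding);
		}
	}
	return $bytes; // no BOM: assume the bytes are already UTF-8
}

echo normalizeExcerptBytes("\xFF\xFEh\x00i\x00"), "\n"; // hi
```

Checking only the start of the buffer also avoids removing byte sequences that merely look like a BOM later in the content, which a global str_replace() would do.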

joshtrichards requested review from enjeck and silverkszlo as code owners June 4, 2026 14:47

joshtrichards added the bug, enhancement, feature: dashboard, 3. to review, and performance 🚀 labels Jun 4, 2026

joshtrichards changed the title ~~perf(dashboard): use lightweight streamed reads for note excerpts~~ perf(dashboard): avoid full note reads when building excerpts Jun 19, 2026

joshtrichards added 3 commits June 22, 2026 07:05

perf(note): stream excerpt instead of reading full note

db6ceec

Signed-off-by: Josh <josh.t.richards@gmail.com>

perf: use pre-compiled string literal for BOM stripping

957ec08

Signed-off-by: Josh <josh.t.richards@gmail.com>

chore(Note): fixup

9a9f8b1

(not using ISimpleFile oops) Signed-off-by: Josh <josh.t.richards@gmail.com>

enjeck force-pushed the jtr/perf-dashboard-excerpt branch from 93364d6 to 9a9f8b1 June 22, 2026 06:05

enjeck reviewed Jun 22, 2026

joshtrichards wants to merge 3 commits into main from jtr/perf-dashboard-excerpt