Skip to content

feat(logging): add Logger interface and default logger (2/6)#723

Open
kamcheungting-db wants to merge 1 commit into
apache:mainfrom
kamcheungting-db:logging-block2-interface
Open

feat(logging): add Logger interface and default logger (2/6)#723
kamcheungting-db wants to merge 1 commit into
apache:mainfrom
kamcheungting-db:logging-block2-interface

Conversation

@kamcheungting-db

@kamcheungting-db kamcheungting-db commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Part 2 of the logging stack (builds on #722). Adds the logging API and a swappable default logger — the foundation the backends and macros plug into.

What's here

  • Logger: the pluggable sink interface (ShouldLog / Log / SetLevel / Flush / Initialize). ShouldLog() is the single source of truth for runtime filtering.
  • LogMessage owns its formatted text so a sink can safely keep a record; reserves an attributes field for future structured logging.
  • Process-global default logger: GetDefaultLogger / SetDefaultLogger / SetDefaultLevel, with a lock-free thread-local fast path so logging stays cheap.
  • Initialize applies the level property, so config-driven levels actually work. CurrentLogger() is safe to call even from a thread_local destructor during thread shutdown.
  • logger.h stays backend-agnostic (never includes the build config header), so consumers see one stable API regardless of backend.

Examples — using the API directly (the LOG_* macros that wrap it arrive in #725):

// A custom sink, installed as the process default.
class MySink : public Logger {
 public:
  bool ShouldLog(LogLevel level) const override { return level >= level_; }
  void Log(LogMessage&& m) noexcept override { write_line(m.message); }
  void SetLevel(LogLevel level) override { level_ = level; }
  LogLevel level() const override { return level_; }
 private:
  std::atomic<LogLevel> level_{LogLevel::kInfo};
};

SetDefaultLogger(std::make_shared<MySink>());   // install process-wide
SetDefaultLevel(LogLevel::kDebug);              // adjust the threshold

auto logger = GetDefaultLogger();               // borrow the current default
if (logger->ShouldLog(LogLevel::kInfo)) {
  logger->Log(LogMessage{.level = LogLevel::kInfo, .message = "scan ready"});
}

// Or configure from catalog-style properties (applies the "level" key):
auto sink = std::make_shared<MySink>();
auto status = sink->Initialize({{std::string(kLevelProperty), "warn"}});  // -> kWarn

The same example is documented inline in logger.h.

Testslogger_test: default-logger API, level-from-property, invalid level rejected, concurrent swap/read, and logging during thread teardown. Built and run with clang/libc++ (spdlog ON and OFF).

This pull request and its description were written by Isaac.

@kamcheungting-db kamcheungting-db changed the title [Iceberg Logger] [Part-2] Logger Interface feat: [Iceberg Logger] [Part-2] Logger Interface Jun 10, 2026
@kamcheungting-db kamcheungting-db changed the title feat: [Iceberg Logger] [Part-2] Logger Interface feat(logging): add Logger interface and default logger Jun 11, 2026
@kamcheungting-db kamcheungting-db changed the title feat(logging): add Logger interface and default logger feat(logging): add Logger interface and default logger (2/6) Jun 11, 2026
@kamcheungting-db kamcheungting-db force-pushed the logging-block2-interface branch 7 times, most recently from fd8c3b4 to c2b6a92 Compare June 16, 2026 03:27
@kamcheungting-db kamcheungting-db force-pushed the logging-block2-interface branch from c2b6a92 to cf09b69 Compare June 16, 2026 03:40
Second block: the backend-agnostic logging API and the swappable default logger.

- Logger: pure-virtual sink (ShouldLog/Log/SetLevel/level/Flush/IsNoop), with
  ShouldLog() as the single authority for runtime filtering and a documented
  no-reentrancy / thread-safety / noexcept contract. Mirrors MetricsReporter.
- LogMessage owns its formatted text (moved in) so a sink may retain records;
  reserves an owning attributes vector for future structured logging without an
  ABI break. logger.h is backend-agnostic -- it never includes the build config
  header nor references the spdlog feature macro, so consumers see one stable API.
- Process-global default logger: GetDefaultLogger / SetDefaultLogger /
  SetDefaultLevel, plus a lock-free thread-local generation cache on the hot path
  (no lock or refcount churn in steady state; slot mutex only on swap/refresh).
  NoopLogger is the placeholder default here; CerrLogger and the spdlog backend
  arrive in the following blocks.

Builds the source into both CMake and Meson and installs logger.h alongside
log_level.h; logger_test covers the default-logger API, level lowering taking
effect immediately, and concurrent swap/read safety.

Co-authored-by: Isaac
@kamcheungting-db kamcheungting-db force-pushed the logging-block2-interface branch from cf09b69 to 246cab6 Compare June 16, 2026 03:58
Comment on lines +103 to +104
(void)l->ShouldLog(LogLevel::kError);
(void)GetDefaultLogger();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(void)l->ShouldLog(LogLevel::kError);
(void)GetDefaultLogger();
std::ignore = l->ShouldLog(LogLevel::kError);
std::ignore = GetDefaultLogger();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

auto status =
logger.Initialize({{std::string(kLevelProperty), std::string("not-a-level")}});
ASSERT_FALSE(status.has_value());
EXPECT_EQ(status.error().kind, ErrorKind::kInvalidArgument);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use IsError matcher.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/// Both key and value are owned so a sink may retain the record safely.
/// Unused in v1; reserved so structured logging can be added without an ABI
/// break to LogMessage.
struct LogAttribute {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add ICEBERG_EXPORT to this and below?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


namespace detail {

const std::shared_ptr<Logger>& CurrentLogger() noexcept {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still not safe for thread_local teardown.

One case: a thread creates another thread_local object first. Then it calls CurrentLogger() in normal code. At thread exit, alive is destroyed before that earlier object. If that earlier object logs in its destructor, we read alive.value after alive is already destroyed. This is UB.

The test only checks first use from a TLS destructor. Maybe we need a design without destructible TLS state, or leak the per-thread cache.

}

/// \brief Cheap check whether a record at \p level would be emitted.
virtual bool ShouldLog(LogLevel level) const = 0;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc says logger implementations must not throw. ShouldLog() is also in the hot path.

But the virtual method is not noexcept. A custom engine logger may throw here by mistake. Then logging can throw from the filter path. Adding noexcept later is also an API/ABI change.

Can we mark ShouldLog() noexcept now? Maybe SetLevel() and level() too, if we want the same rule for all logger methods.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

#pragma once

/// \file iceberg/logging/logger.h
/// \brief Pluggable logging interface, the process-global default logger, and

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small doc issue. This header says it has logging macros, but this PR does not add them.

The PR text says macros are in a later block. Users who read the installed header may be confused.

Maybe say this header has the logger API and macro helper hooks. Or move this sentence to the later PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

///
/// Off the hot path -- acquires the slot lock and returns an owning copy. The
/// logging macros use the cheaper internal hot-path accessor instead.
ICEBERG_EXPORT std::shared_ptr<Logger> GetDefaultLogger();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only having a process-global default logger may be hard for engine integration.

In engines, users may want different loggers for different catalog, session, query, or task. If all code only reads one global logger, it is hard to route logs to the engine logging context.

Can we consider a way to pass or bind a logger from a higher level context? The global logger can still be the fallback.

///
/// Declared ICEBERG_EXPORT because the logging macros expand into this call in
/// consumer translation units.
ICEBERG_EXPORT void Emit(Logger& logger, LogLevel level,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep a lazy format-style API here.

Now Emit takes an already formatted std::string. This means the string must be built before calling the sink. It is OK after ShouldLog(), but it still makes the API string-based. Some engine loggers may want to keep format string + args, or format only inside their own backend.

Can we provide an API closer to std::format style, like log(level, fmt, args...), and only build the string after the level check? This also makes normal use easier for callers.

LogLevel level = LogLevel::kOff;
std::string message;
std::source_location location = std::source_location::current();
std::vector<LogAttribute> attributes;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the attributes field is useful, but now there is no easy API to fill it.

For engine loggers, structured fields are often important. For example query id, task id, table name, snapshot id, file path, etc. If the first public API only supports message text in practice, it may be harder to add good structured logging later.

Can we add a small helper or builder path for attributes, or at least define how future macros will pass them?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants