linux - List all files a process accesses in an efficient manner


I want to log the file accesses a process makes during its lifetime in an efficient manner.

Currently, we are doing this using LD_PRELOAD, by preloading a shared library that intercepts the C library calls that deal with file accesses. The method is efficient, without much performance overhead, but it is not leak proof.

For instance, the LD_PRELOAD shared library we have has a hook for dlopen. This hook is used to track accesses to shared libraries, but the mechanism fails to log the tertiary dependencies of a shared library.

We did try using strace, but the performance overhead of using strace was a non-starter for us. I was curious whether there are other mechanisms we can explore to intercept the file accesses that a process and its sub-processes make in an efficient manner. I am open to exploring options at the kernel level, hooks into the VFS layer, or anything else.

Thoughts?

We did try using strace, but the performance overhead of using strace was a non-starter for us.

strace is slow because it uses the ancient and slow ptrace syscall to act as a debugger for the application. Every syscall made by the application is converted into a signal to strace, followed by around 2 ptrace syscalls made by strace (plus some printing, and access to the other process's memory for string/struct values), and then by continuing the target application (2 context switches). strace supports syscall filters, but the filter can't be registered with ptrace, so strace does the filtering in user-space while still tracing all syscalls.

There are faster kernel-based solutions. Brendan Gregg (author of the DTrace Book - Solaris, OSX, FreeBSD) has many overviews of tracing tools (in his blog: tracing in 15 minutes, BPF superpowers, 60s of Linux perf, Choosing a Tracer 2015 (with Magic pony), page cache stats), for example:

[Image: Brendan Gregg - Linux kernel analysis and tools]

You are interested in the left part of this diagram, near the VFS block: perf (the standard tool), dtrace (supported only in some Linuxes, and it has license problems - CDDL is incompatible with GPL), and stap (SystemTap, which works better with Red Hat Linuxes such as CentOS).

There is a direct replacement for strace - the sysdig tool (it requires an additional kernel module; available on GitHub), which works for system calls the way tcpdump works for network interface sniffing. This tool sniffs syscalls inside the kernel without additional context switches, signals, or poking into the other process's memory with ptrace (the kernel already has all strings copied from the user), and it uses smart buffering to dump traces to the userspace tool in huge packets.

There are other universal tracing frameworks/tools such as LTTng (out of tree) and ftrace / trace-cmd. And bcc with eBPF is a very powerful framework included in modern (4.9+) Linux kernels (check http://www.brendangregg.com/slides/scale2017_perf_analysis_ebpf.pdf). bcc and eBPF allow you to write small (and safe) code fragments to do some data aggregation in-kernel near the tracepoint:

[Image: Brendan Gregg's list of bcc tools around Linux kernel subsystems]

Try Brendan's tools near VFS if your Linux kernel is recent enough: opensnoop, statsnoop, syncsnoop; probably also the file* tools (the tools support PID filtering with -p PID, or may work system-wide). They are described partially at http://www.brendangregg.com/dtrace.html and published on his GitHub: https://github.com/brendangregg/perf-tools (also https://github.com/iovisor/bcc#tools)
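To make the "small code fragment near a tracepoint" idea concrete, here is a minimal sketch of a bcc script that prints the filename of every openat() call. It is not how the stock opensnoop tool is implemented, and it is not leak proof either, since it watches only one syscall. The names BPF_TEMPLATE, build_program and trace_opens are hypothetical and chosen for illustration. It assumes root privileges, the bcc Python bindings, and a kernel that provides bpf_probe_read_user_str (roughly 5.5+; older kernels would need bpf_probe_read_str instead).

```python
# Sketch only: needs root, a BPF-capable kernel and the bcc Python bindings.
# All names below are illustrative, not part of bcc itself.

BPF_TEMPLATE = r"""
TRACEPOINT_PROBE(syscalls, sys_enter_openat) {
    char fname[128] = {};
    u32 tgid = bpf_get_current_pid_tgid() >> 32;   /* user-space PID */
    PID_FILTER
    /* Assumption: kernel has bpf_probe_read_user_str (about 5.5+). */
    bpf_probe_read_user_str(fname, sizeof(fname), args->filename);
    bpf_trace_printk("%d %s\n", tgid, fname);
    return 0;
}
"""


def build_program(pid=None):
    """Return the BPF C source, optionally restricted to a single PID.

    The filter is spliced in by text replacement before compilation.
    Child processes have their own PIDs, so a single-PID filter will
    not see files opened by sub-processes.
    """
    pid_filter = ""
    if pid is not None:
        pid_filter = "if (tgid != %d) { return 0; }" % int(pid)
    return BPF_TEMPLATE.replace("PID_FILTER", pid_filter)


def trace_opens(pid=None):
    """Compile and attach the probe, then stream trace lines until Ctrl-C."""
    from bcc import BPF  # imported lazily so this file loads without bcc

    bpf = BPF(text=build_program(pid))
    try:
        bpf.trace_print()
    except KeyboardInterrupt:
        pass
```

Calling trace_opens(1234) as root would stream one line per openat() made by PID 1234; calling trace_opens() with no argument traces system-wide. For real use, the maintained opensnoop tool is likely the better starting point, since it also reports the returned file descriptor and handles more cases.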

As of Linux 4.9, the Linux kernel finally has similar raw capabilities as DTrace. ...

opensnoop is a program to snoop file opens. The filename and file handle are traced along with some process details.

# opensnoop -g
  UID   PID PATH                                   FD ARGS
  100  3528 /var/ld/ld.config                      -1 cat /etc/passwd
  100  3528 /usr/lib/libc.so.1                      3 cat /etc/passwd
  100  3528 /etc/passwd                             3 cat /etc/passwd
  100  3529 /var/ld/ld.config                      -1 cal
  100  3529 /usr/lib/libc.so.1                      3 cal
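Since the goal in the question is a list of files per process, output in this shape can be post-processed into exactly that. The following is a rough sketch, assuming the five-column UID/PID/PATH/FD/ARGS layout shown above and paths without spaces; the function name files_by_pid is hypothetical, and other opensnoop versions print different columns.

```python
from collections import defaultdict


def files_by_pid(text):
    """Map each PID to the sorted list of paths it opened successfully.

    Assumes whitespace-separated columns: UID PID PATH FD ARGS.
    Lines that do not fit (command line, header) are skipped, and so
    are failed opens, which show up with a negative FD.
    """
    opened = defaultdict(set)
    for line in text.splitlines():
        parts = line.split(None, 4)  # ARGS may itself contain spaces
        if len(parts) < 4 or not parts[1].isdigit():
            continue
        pid, path, fd = parts[1], parts[2], parts[3]
        try:
            fd_num = int(fd)
        except ValueError:
            continue
        if fd_num >= 0:
            opened[int(pid)].add(path)
    return {pid: sorted(paths) for pid, paths in opened.items()}


SAMPLE = """\
# opensnoop -g
  UID   PID PATH                                   FD ARGS
  100  3528 /var/ld/ld.config                      -1 cat /etc/passwd
  100  3528 /usr/lib/libc.so.1                      3 cat /etc/passwd
  100  3528 /etc/passwd                             3 cat /etc/passwd
  100  3529 /var/ld/ld.config                      -1 cal
  100  3529 /usr/lib/libc.so.1                      3 cal
"""

for pid, paths in sorted(files_by_pid(SAMPLE).items()):
    print(pid, paths)
```

Dropping the failed opens is a design choice; if attempted accesses matter too (for example, to record probed config paths), the FD check can simply be removed.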

rwsnoop snoops read/write events. It measures reads and writes at the application level - syscalls.

# rwsnoop
  UID    PID CMD          D   BYTES FILE
    0   2924 sh           R     128 /etc/profile
    0   2924 sh           R     128 /etc/profile
    0   2924 sh           R     128 /etc/profile
    0   2924 sh           R      84 /etc/profile
    0   2925 quota        R     757 /etc/nsswitch.conf
    0   2925 quota        R       0 /etc/nsswitch.conf
    0   2925 quota        R     668 /etc/passwd
