Skip to content

Commit

Permalink
new(libsinsp): add concatenated process lineage fields as filter / di…
Browse files Browse the repository at this point in the history
…splay fields

* Less loops and more efficient if you want to export all process ancestors up to a certain level anyways
* More convenient and intuitive display of information -> security analysts prefer an output like this /bin/java->/bin/bash->/bin/python->/bin/bash to quickly understand the process origins
* Enhanced threat detection capabilities -> can string match exact lineage / sequence
* Needed in the planned anomaly detection framework

Signed-off-by: Melissa Kilby <[email protected]>
  • Loading branch information
incertum committed Jan 17, 2024
1 parent 48277b1 commit 7dd75f2
Show file tree
Hide file tree
Showing 3 changed files with 178 additions and 1 deletion.
167 changes: 166 additions & 1 deletion userspace/libsinsp/sinsp_filtercheck_thread.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -49,12 +49,15 @@ static const filtercheck_field_info sinsp_filter_check_thread_fields[] =
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.exe", "First Argument", "The first command line argument argv[0] (truncated after 4096 bytes) which is usually the executable name but it could be also a custom string, it depends on what the user specifies. This field is collected from the syscalls args or, as a fallback, extracted from /proc/PID/cmdline."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.pexe", "Parent First Argument", "The proc.exe (first command line argument argv[0]) of the parent process."},
{PT_CHARBUF, EPF_ARG_ALLOWED, PF_NA, "proc.aexe", "Ancestor First Argument", "The proc.exe (first command line argument argv[0]) for a specific process ancestor. You can access different levels of ancestors by using indices. For example, proc.aexe[1] retrieves the proc.exe of the parent process, proc.aexe[2] retrieves the proc.exe of the grandparent process, and so on. The current process's proc.exe line can be obtained using proc.aexe[0]. When used without any arguments, proc.aexe is applicable only in filters and matches any of the process ancestors. For instance, you can use `proc.aexe endswith java` to match any process ancestor whose proc.exe ends with the term `java`."},
{PT_CHARBUF, EPF_ARG_ALLOWED, PF_NA, "proc.concat_aexe", "Ancestor First Argument Concatenated", "The proc.exe lineage is traced back to a specific process ancestor, up to the specified level of ancestors. The result is a string representing the concatenated lineage, with '->' as the delimiter, starting with the furthest ancestor and including the current process. For example, proc.aexe_concat[3] returns the string `/bin/java->/bin/bash->/bin/python->/bin/bash`, assuming a java process launched a bash process that ultimately launched python and finally bash."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.exepath", "Process Executable Path", "The full executable path of the process (it could be truncated after 1024 bytes if read from '/proc'). This field is collected directly from the kernel or, as a fallback, extracted resolving the path of /proc/PID/exe, so symlinks are resolved. If you are using eBPF drivers this path could be truncated due to verifier complexity limits. (legacy eBPF kernel version < 5.2) truncated after 24 path components. (legacy eBPF kernel version >= 5.2) truncated after 48 path components. (modern eBPF kernel) truncated after 96 path components."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.pexepath", "Parent Process Executable Path", "The proc.exepath (full executable path) of the parent process."},
{PT_CHARBUF, EPF_ARG_ALLOWED, PF_NA, "proc.aexepath", "Ancestor Executable Path", "The proc.exepath (full executable path) for a specific process ancestor. You can access different levels of ancestors by using indices. For example, proc.aexepath[1] retrieves the proc.exepath of the parent process, proc.aexepath[2] retrieves the proc.exepath of the grandparent process, and so on. The current process's proc.exepath line can be obtained using proc.aexepath[0]. When used without any arguments, proc.aexepath is applicable only in filters and matches any of the process ancestors. For instance, you can use `proc.aexepath endswith java` to match any process ancestor whose path ends with the term `java`."},
{PT_CHARBUF, EPF_ARG_ALLOWED, PF_NA, "proc.concat_aexepath", "Ancestor Executable Path Concatenated", "The proc.exepath lineage is traced back to a specific process ancestor, up to the specified level of ancestors. The result is a string representing the concatenated lineage, with '->' as the delimiter, starting with the furthest ancestor and including the current process. For example, proc.aexepath_concat[3] returns the string `/bin/java->/bin/bash->/bin/python->/bin/bash`, assuming a java process launched a bash process that ultimately launched python and finally bash."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.name", "Name", "The process name (truncated after 16 characters) generating the event (task->comm). Truncation is determined by kernel settings and not by Falco. This field is collected from the syscalls args or, as a fallback, extracted from /proc/PID/status. The name of the process and the name of the executable file on disk (if applicable) can be different if a process is given a custom name which is often the case for example for java applications."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.pname", "Parent Name", "The proc.name truncated after 16 characters) of the process generating the event."},
{PT_CHARBUF, EPF_ARG_ALLOWED, PF_NA, "proc.aname", "Ancestor Name", "The proc.name (truncated after 16 characters) for a specific process ancestor. You can access different levels of ancestors by using indices. For example, proc.aname[1] retrieves the proc.name of the parent process, proc.aname[2] retrieves the proc.name of the grandparent process, and so on. The current process's proc.name line can be obtained using proc.aname[0]. When used without any arguments, proc.aname is applicable only in filters and matches any of the process ancestors. For instance, you can use `proc.aname=bash` to match any process ancestor whose name is `bash`."},
{PT_CHARBUF, EPF_ARG_ALLOWED, PF_NA, "proc.concat_aname", "Ancestor Name Concatenated", "The proc.name (truncated after 16 characters) lineage is traced back to a specific process ancestor, up to the specified level of ancestors. The result is a string representing the concatenated lineage, with '->' as the delimiter, starting with the furthest ancestor and including the current process. For example, proc.aname_concat[3] returns the string `java->bash->python->bash`, assuming a java process launched a bash process that ultimately launched python and finally bash."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.args", "Arguments", "The arguments passed on the command line when starting the process generating the event excluding argv[0] (truncated after 4096 bytes). This field is collected from the syscalls args or, as a fallback, extracted from /proc/PID/cmdline."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.cmdline", "Command Line", "The concatenation of `proc.name + proc.args` (truncated after 4096 bytes) when starting the process generating the event."},
{PT_CHARBUF, EPF_NONE, PF_NA, "proc.pcmdline", "Parent Command Line", "The proc.cmdline (full command line (proc.name + proc.args)) of the parent of the process generating the event."},
Expand Down Expand Up @@ -156,7 +159,10 @@ int32_t sinsp_filter_check_thread::extract_arg(std::string fldname, std::string
m_field_id == TYPE_ANAME ||
m_field_id == TYPE_AEXE ||
m_field_id == TYPE_AEXEPATH ||
m_field_id == TYPE_ACMDLINE)
m_field_id == TYPE_ACMDLINE ||
m_field_id == TYPE_ANAME_CONCAT ||
m_field_id == TYPE_AEXE_CONCAT ||
m_field_id == TYPE_AEXEPATH_CONCAT)
{
if(val[fldname.size()] == '[')
{
Expand Down Expand Up @@ -280,6 +286,28 @@ int32_t sinsp_filter_check_thread::parse_field_name(const char* str, bool alloc_

return res;
}
else if(STR_MATCH("proc.concat_aname"))
{
m_field_id = TYPE_ANAME_CONCAT;
m_field = &m_info.m_fields[m_field_id];

int32_t res = 0;

try
{
res = extract_arg("proc.concat_aname", val, NULL);
}
catch(...)
{
if(val == "proc.concat_aname")
{
m_argid = -1;
res = (int32_t)val.size();
}
}

return res;
}
else if(STR_MATCH("proc.aexepath"))
{
m_field_id = TYPE_AEXEPATH;
Expand All @@ -302,6 +330,28 @@ int32_t sinsp_filter_check_thread::parse_field_name(const char* str, bool alloc_

return res;
}
else if(STR_MATCH("proc.concat_aexepath"))
{
m_field_id = TYPE_AEXEPATH_CONCAT;
m_field = &m_info.m_fields[m_field_id];

int32_t res = 0;

try
{
res = extract_arg("proc.concat_aexepath", val, NULL);
}
catch(...)
{
if(val == "proc.concat_aexepath")
{
m_argid = -1;
res = (int32_t)val.size();
}
}

return res;
}
/* note: because of str similarity of proc.aexe to proc.aexepath, this needs to be placed after proc.aexepath */
else if(STR_MATCH("proc.aexe"))
{
Expand All @@ -325,6 +375,28 @@ int32_t sinsp_filter_check_thread::parse_field_name(const char* str, bool alloc_

return res;
}
else if(STR_MATCH("proc.concat_aexe"))
{
m_field_id = TYPE_AEXEPATH_CONCAT;
m_field = &m_info.m_fields[m_field_id];

int32_t res = 0;

try
{
res = extract_arg("proc.concat_aexe", val, NULL);
}
catch(...)
{
if(val == "proc.concat_aexe")
{
m_argid = -1;
res = (int32_t)val.size();
}
}

return res;
}
else if(STR_MATCH("proc.acmdline"))
{
m_field_id = TYPE_ACMDLINE;
Expand Down Expand Up @@ -1146,6 +1218,37 @@ uint8_t* sinsp_filter_check_thread::extract(sinsp_evt *evt, OUT uint32_t* len, b
m_tstr = mt->get_comm();
RETURN_EXTRACT_STRING(m_tstr);
}
case TYPE_ANAME_CONCAT:
{
sinsp_threadinfo* mt = NULL;

if(tinfo->is_main_thread())
{
mt = tinfo;
}
else
{
mt = tinfo->get_main_thread();

if(mt == NULL)
{
return NULL;
}
}

m_tstr = mt->get_comm();
for(int32_t j = 0; j < m_argid; j++)
{
mt = mt->get_parent_thread();

if(mt == NULL)
{
RETURN_EXTRACT_STRING(m_tstr);
}
m_tstr = mt->get_comm() + "->" + m_tstr;
}
RETURN_EXTRACT_STRING(m_tstr);
}
case TYPE_PEXE:
{
sinsp_threadinfo* ptinfo =
Expand Down Expand Up @@ -1192,6 +1295,37 @@ uint8_t* sinsp_filter_check_thread::extract(sinsp_evt *evt, OUT uint32_t* len, b
m_tstr = mt->get_exe();
RETURN_EXTRACT_STRING(m_tstr);
}
case TYPE_AEXE_CONCAT:
{
sinsp_threadinfo* mt = NULL;

if(tinfo->is_main_thread())
{
mt = tinfo;
}
else
{
mt = tinfo->get_main_thread();

if(mt == NULL)
{
return NULL;
}
}

m_tstr = mt->get_exe();
for(int32_t j = 0; j < m_argid; j++)
{
mt = mt->get_parent_thread();

if(mt == NULL)
{
RETURN_EXTRACT_STRING(m_tstr);
}
m_tstr = mt->get_exe() + "->" + m_tstr;
}
RETURN_EXTRACT_STRING(m_tstr);
}
case TYPE_PEXEPATH:
{
sinsp_threadinfo* ptinfo =
Expand Down Expand Up @@ -1238,6 +1372,37 @@ uint8_t* sinsp_filter_check_thread::extract(sinsp_evt *evt, OUT uint32_t* len, b
m_tstr = mt->get_exepath();
RETURN_EXTRACT_STRING(m_tstr);
}
case TYPE_AEXEPATH_CONCAT:
{
sinsp_threadinfo* mt = NULL;

if(tinfo->is_main_thread())
{
mt = tinfo;
}
else
{
mt = tinfo->get_main_thread();

if(mt == NULL)
{
return NULL;
}
}

m_tstr = mt->get_exepath();
for(int32_t j = 0; j < m_argid; j++)
{
mt = mt->get_parent_thread();

if(mt == NULL)
{
RETURN_EXTRACT_STRING(m_tstr);
}
m_tstr = mt->get_exepath() + "->" + m_tstr;
}
RETURN_EXTRACT_STRING(m_tstr);
}
case TYPE_LOGINSHELLID:
{
sinsp_threadinfo* mt = NULL;
Expand Down
3 changes: 3 additions & 0 deletions userspace/libsinsp/sinsp_filtercheck_thread.h
Original file line number Diff line number Diff line change
Expand Up @@ -28,12 +28,15 @@ class sinsp_filter_check_thread : public sinsp_filter_check
TYPE_EXE = 0,
TYPE_PEXE,
TYPE_AEXE,
TYPE_AEXE_CONCAT,
TYPE_EXEPATH,
TYPE_PEXEPATH,
TYPE_AEXEPATH,
TYPE_AEXEPATH_CONCAT,
TYPE_NAME,
TYPE_PNAME,
TYPE_ANAME,
TYPE_ANAME_CONCAT,
TYPE_ARGS,
TYPE_CMDLINE,
TYPE_PCMDLINE,
Expand Down
9 changes: 9 additions & 0 deletions userspace/libsinsp/test/events_proc.ut.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,9 @@ TEST_F(sinsp_with_test_input, spawn_process)
ASSERT_EQ(get_field_as_string(evt, "proc.name"), "test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.aname[0]"), "test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.aname"), "test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aname[0]"), "test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aname[2]"), "init->test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aname"), "test-exe");

// check that the pid is updated
ASSERT_EQ(get_field_as_string(evt, "proc.pid"), "20");
Expand All @@ -363,11 +366,17 @@ TEST_F(sinsp_with_test_input, spawn_process)
ASSERT_EQ(get_field_as_string(evt, "proc.exe"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.aexe[0]"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.aexe"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aexe[0]"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aexe[2]"), "/sbin/init->/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aexe"), "/bin/test-exe");

// check that the exepath is updated
ASSERT_EQ(get_field_as_string(evt, "proc.exepath"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.aexepath[0]"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.aexepath"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aexepath[0]"), "/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aexepath[2]"), "/sbin/init->/bin/test-exe");
ASSERT_EQ(get_field_as_string(evt, "proc.concat_aexepath"), "/bin/test-exe");

// check session leader (sid) related fields
ASSERT_EQ(get_field_as_string(evt, "proc.sid"), "0");
Expand Down

0 comments on commit 7dd75f2

Please sign in to comment.