YARA Rules Study Notes (Part 2)

Last updated: 2021-09-22 (afternoon)

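Conditions

Conditions can combine the boolean operators and, or, and not; the relational operators >=, <=, <, >, ==, and !=; the arithmetic operators +, -, *, \, and % (where \ is integer division); and the bitwise operators &, |, <<, >>, ~, and ^.

String identifiers can also be used in a condition, where they act as boolean variables whose value depends on whether the associated string is present in the file.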

rule test
{
	strings:
		$a = "asd" 
		$b = "123"
		$c = "zxc"
		$d = "987"
	condition:
		($a or $b) and ($c or $d)
}

This rule matches when at least one of $a or $b is found and, at the same time, at least one of $c or $d is found.

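Counting strings

Sometimes it is not enough to know whether a string is present; you also need to know how many times it occurs in the file or in process memory. The number of occurrences of each string is exposed as a variable whose name is the string identifier with # in place of $.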

rule test
{
	strings:
		$a = "test"
		$b = "123"
	condition:
		#a == 6 and #b < 3
}

This rule matches when $a occurs exactly six times and $b occurs fewer than three times.

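String offsets or virtual addresses

In some cases you need to know whether a string sits at a specific offset in the file or at a specific virtual address in the process address space. The at operator handles this.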

rule test
{
	strings:
		$a = "test"
		$b = "123"
	condition:
		$a at 0xA and $b at 0x10
}

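This rule requires "test" at file offset 0xA (or at virtual address 0xA when scanning a process), and likewise "123" at 0x10.

Where at pins an exact address, in specifies a range.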

rule test
{
	strings:
		$a = "test"
		$b = "123"
	condition:
		$a in (0..10) and $b in (10..filesize)
}

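This rule requires $a at an offset between 0 and 10 and $b at an offset between 10 and the end of the file.

If "123" is placed at the start of the file and "test" somewhere after offset 10, the rule no longer matches.

Swapping the two strings back makes it match again.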

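Match length

In my own tests I could not get this to work, and I do not yet know why. I am leaving this as a placeholder and will fill it in once I figure it out.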
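
For reference, here is a minimal sketch of the documented syntax. It assumes the !a[i] form, which gives the length of the i-th match of $a and is mainly useful for regular expressions and hex strings with jumps, where the match length varies. The rule name and the regular expression are made up for illustration, and this has not been checked against the test setup described above.

```yara
rule MatchLengthSketch
{
	strings:
		// variable-length pattern: "test" followed by one or more digits
		$re = /test[0-9]+/
	condition:
		// !re[1] is the length of the first match of $re
		$re and !re[1] >= 6
}
```

Under these assumptions, a file containing "test12" should match (the match is 6 bytes long), while one containing only "test1" should not.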

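File size

String identifiers are not the only variables that can appear in a condition (in fact, a rule can be defined without any strings at all). There are other special variables as well. One of them is filesize, which holds the size of the file in bytes.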

rule testFileSize
{
	condition:
		filesize > 5KB
}

This rule matches only files larger than 5KB.

The example uses the KB suffix; MB is also available.
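
A small sketch combining both suffixes to bound the size from both sides. The rule name and the 2MB upper limit are arbitrary choices for illustration; KB multiplies the number by 1024 and MB by 2^20.

```yara
rule SizeWindowSketch
{
	condition:
		// larger than 5 * 1024 bytes and smaller than 2 * 2^20 bytes
		filesize > 5KB and filesize < 2MB
}
```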

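Executable entry point

Another special variable is entrypoint, but it is now deprecated: since YARA 3.0 you should use pe.entry_point from the PE module instead.

Entry-point code can differ slightly from one compiler to another, so analyze your target before writing the rule.

My test program was compiled with VS2019; the reported version is 14.29.

Opening it in a debugger shows the entry point, and the bytes there are E8 C5 03 00 00.

Writing the rule: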

import "pe"
rule EntryPoint
{
	strings:
		$a = {E8 C5 03 00 00}
	condition:
		$a at pe.entry_point
}

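The rule matches.

Accessing data at a given position

In many cases you need to test the data stored at a particular file offset or virtual memory address. The following functions read an integer from a given offset: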

int8(<offset or virtual address>)
int16(<offset or virtual address>)
int32(<offset or virtual address>)

uint8(<offset or virtual address>)
uint16(<offset or virtual address>)
uint32(<offset or virtual address>)

int8be(<offset or virtual address>)
int16be(<offset or virtual address>)
int32be(<offset or virtual address>)

uint8be(<offset or virtual address>)
uint16be(<offset or virtual address>)
uint32be(<offset or virtual address>)

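The intXX functions read 8-, 16-, and 32-bit signed integers, and the uintXX functions read unsigned ones. Both are little-endian by default; append be to read big-endian.

Now let's write a rule that checks whether a file is a PE. A PE file starts with "MZ" (0x5A4D when read as a little-endian word). The e_lfanew field of the DOS header, at offset 0x3C from the MZ header, holds the offset of the PE signature, which is "PE" (0x4550). Together, these two values identify a PE file.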

rule IsPe
{
	condition:
		// "MZ" magic, a WORD
		uint16(0) == 0x5A4D
		// e_lfanew -> PE signature "PE", a DWORD
		and uint32(uint32(0x3C)) == 0x00004550
}
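
As a sketch of the be variants, the same check can be written with big-endian reads, in which case the constants appear in the same order as the bytes in the file. The rule name is made up for illustration. Note that e_lfanew itself is stored little-endian, so it is still read with uint32.

```yara
rule IsPeBigEndianSketch
{
	condition:
		// bytes 4D 5A ("MZ") read as a big-endian word
		uint16be(0) == 0x4D5A
		// bytes 50 45 00 00 ("PE\0\0") read as a big-endian dword
		and uint32be(uint32(0x3C)) == 0x50450000
}
```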

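Sets of strings

Sometimes the condition needs to state that at least a certain number of strings from a set must be present. The of operator does this.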

rule test
{
	strings:
		$a = "test1"
		$b = "test2"
		$c = "test3"
	condition:
		2 of ($a,$b,$c)
}

This rule requires at least two of $a, $b, and $c to be present, in any combination.

Wildcards can also be used in the set.

rule test
{
	strings:
		$test1 = "test1"
		$test2 = "test2"
		$test3 = "test3"
	condition:
		2 of ($test*)
}

You can also use $* to refer to every string in the rule, or use the keyword them.

rule test1
{
	strings:
		$test1 = "test1"
		$test2 = "test2"
		$test3 = "test3"
	condition:
		2 of ($*)
}

rule test2
{
	strings:
		$test1 = "test1"
		$test2 = "test2"
		$test3 = "test3"
	condition:
		1 of them
}

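The examples above all use a numeric constant for the number of strings. Other expressions work too, including the keywords any and all:

all of them // every string in the rule must match

any of them // at least one string in the rule must match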
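
A minimal sketch that puts all and any into a complete rule and combines them with wildcard sets. The rule name and string identifiers are made up for illustration.

```yara
rule AllAndAnySketch
{
	strings:
		$a1 = "test1"
		$a2 = "test2"
		$b1 = "test3"
		$b2 = "test4"
	condition:
		// both $a1 and $a2 must be present, plus at least one of $b1, $b2
		all of ($a*) and any of ($b*)
}
```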

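Applying the same condition to many strings

The more powerful for..of operator applies the same condition to several strings. Its syntax is:

for a of b : (c)

That is, at least a of the strings in set b must satisfy condition c.

Some examples: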

import "pe"
rule test
{
	strings:
		$a = {E8 C5 03 00 00}
		$b = {E8 01 00 00 00}
	condition:
		for any of ($a,$b) : ($ at pe.entry_point)
}

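In this rule, at least one of the strings $a and $b must be found at pe.entry_point. Inside the parentheses, $ stands for whichever string is currently being evaluated.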

rule test
{
	strings:
		$a = "test1" 
		$b = "test2" 
		$c = "test3" 
	condition:
		for any of them : (# > 2)
}

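In this rule, at least one of the three strings $a, $b, and $c must occur more than twice.

Using anonymous strings

When you use of or for..of together with them, the identifiers assigned to individual strings are usually superfluous, because no string is referenced individually. In those cases you can declare anonymous strings whose identifier consists of the $ character alone.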

rule test
{
	strings:
		$ = "test1"
		$ = "test2"
	condition:
		any of them
}

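Iterating over string occurrences

YARA also lets you write something like a C for loop, using the for..in syntax.

The example below walks all sections of a PE file and matches when one of them is named .text.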

import "pe"
rule test	
{
	condition:
		for any section in pe.sections : (section.name == ".text")
}

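For instance, this can be used to detect UPX packing: after packing with UPX, the file has a section named after UPX (UPX0 in the rule below). The screenshot shows the section information after packing with UPX 3.96.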

import "pe"
rule test	
{
	condition:
		for any section in pe.sections : (section.name == "UPX0") 
}

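In the next rule, each of the first three occurrences of $b must be at an offset exactly 7 greater than the corresponding occurrence of $a.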

rule test	
{
	strings:
		$a = "test1"
		$b = "test2"
	condition:
		for all i in (1,2,3) : (@a[i] + 7 == @b[i])
}
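
Instead of a fixed list of indexes, the loop can also run over a range built from the occurrence count. A minimal sketch, with a made-up rule name and an arbitrary offset limit of 100:

```yara
rule AllOccurrencesEarlySketch
{
	strings:
		$a = "test1"
	condition:
		// $a must occur, and every occurrence must start before offset 100
		$a and for all i in (1..#a) : (@a[i] < 100)
}
```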

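Referencing other rules

When writing a rule's condition, you can also refer to a previously defined rule, in a way that resembles a function call.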

rule test1
{
	strings:
		$a = "test1"
	condition:
		$a
}

rule test2
{
	strings:
		$a = "test2"
	condition:
		$a and test1
}

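Global rules

A global rule imposes a restriction on all rules at once. For example, to limit file size you can define one global rule instead of repeating the size check in every rule.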

global rule sizelimit
{
	condition:
		filesize < 100KB
}

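Private rules

A private rule is simply a rule that YARA does not report when it matches a file. When one rule references another, marking the referenced rule as private keeps it out of the output.

private rule test1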
{
	strings:
		$a = "test1"
	condition:
		$a
}

rule test2
{
	strings:
		$a = "test2"
	condition:
		$a and test1
}

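The output shows only test2; test1 is not reported, although it normally would be.

Rule tags

YARA lets you attach tags to rules. Tags are used to filter YARA's output so that only the rules you are interested in are shown.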

rule test1 : test
{
	strings:
		$a = "test1"
	condition:
		$a
}

rule test2 : b1ackie
{
	strings:
		$a = "test2"
	condition:
		$a and test1
}

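Metadata

Besides the strings and condition sections, a rule can also have a metadata section where you put additional information about the rule. The metadata section is introduced by the keyword meta.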

rule test1 : test
{
	strings:
		$a = "test1"
	condition:
		$a
}

rule test2 : b1ackie
{
	strings:
		$a = "test2"
	condition:
		$a and test1
}
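
A minimal sketch of what a meta section can look like. The rule name and every metadata key and value here are placeholders invented for illustration; metadata values can be strings, integers, or booleans, and they cannot be used in the condition.

```yara
rule MetaSketch : test
{
	meta:
		// all values below are hypothetical
		author = "b1ackie"
		description = "Matches files containing test1"
		version = 1
		in_the_wild = false
	strings:
		$a = "test1"
	condition:
		$a
}
```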

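Using modules

Modules extend YARA's core functionality. Some, such as PE and Cuckoo, are distributed officially with YARA; others can be created by third parties. The first step in using a module is to import it:

import "pe"

Once a module is imported, you use its features as modulename.member, for example: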

pe.entry_point

pe.sections

External variables

External variables let you define rules that depend on values supplied from outside.

rule ExternalVariableExample1
{
    condition:
        ext_var == 10
}

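Here ext_var is an external variable whose value is passed with the -d option (for example, -d ext_var=10).

External variables can be integers, strings, or booleans; the type depends on the value assigned. An integer variable can stand in for any integer constant in a condition, and a boolean variable can take the place of a boolean expression.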

rule ExternalVariableExample2
{
    condition:
        bool_ext_var or filesize < int_ext_var
}

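String external variables can be used with these operators:

contains: returns true if the string contains the given substring

matches: returns true if the string matches the given regular expression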

rule ContainsExample
{
    condition:
        string_ext_var contains "text"
}
rule MatchesExample
{
    condition:
        string_ext_var matches /[a-z]+/
}