2007-01-21

regex in Linux with C

On Linux, many applications need regular expression support (e.g. grep, sed, awk), so the C library provides POSIX-style regex support:
#include <sys/types.h>
#include <regex.h>

/* regcomp() compiles a regular expression into a form that is
       suitable for subsequent regexec() searches. */
int    regcomp(regex_t *preg, const char *regex, int cflags);

Returns 0 if the regular expression is valid.

/* regexec() matches a null-terminated string against the precompiled
       pattern buffer. */
int    regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t pmatch[], int eflags);

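Returns 0 on a successful match. pmatch[0].rm_so and pmatch[0].rm_eo are the start and end offsets of the first match within string (-1 means no match).
Note: pmatch[i] (i=1,2,...) are not the second, third, ... matches; they are the subexpression matches inside the first match. To find later matches, call regexec() again on string, starting from the end offset of the previous match.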
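The "call regexec() again from the end of the previous match" idea can be sketched as a small counting function. This is only an illustration under my own assumptions: the name count_matches is made up, it always compiles with REG_EXTENDED, and it steps one byte forward after an empty match so the loop cannot spin forever.

```c
#include <regex.h>
#include <stddef.h>

/* Count non-overlapping matches of an extended regex in text.
 * Returns -1 if the pattern does not compile.
 * (count_matches is a hypothetical helper, not part of regex.h.) */
int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    int count = 0;
    int eflags = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (regexec(&re, text, 1, m, eflags) == 0) {
        count++;
        /* rm_so/rm_eo are relative to the pointer passed to regexec() */
        if (m[0].rm_eo == m[0].rm_so) {
            /* empty match: step one byte forward to avoid looping forever */
            if (text[m[0].rm_eo] == '\0')
                break;
            text += m[0].rm_eo + 1;
        } else {
            text += m[0].rm_eo;
        }
        /* later calls no longer start at the beginning of the line */
        eflags = REG_NOTBOL;
    }
    regfree(&re);
    return count;
}
```

Calling count_matches("[0-9]+", "a1b22c333") walks over "1", "22" and "333" in turn, each time restarting from the end of the previous match.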

/* free the memory allocated to the pattern buffer by  the  compiling  process */
void   regfree(regex_t *preg);

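When you are done with a regular expression, or before compiling a new one, use this function to release the contents of the regex_t structure that preg points to. Remember: if you are going to compile a new regular expression into the same regex_t, you must free it first.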

/* turn the error codes that can be returned by both regcomp() and regexec() into error message strings */
size_t regerror(int errcode, const regex_t *preg, char *errbuf,  size_t errbuf_size);

Produces the error message for a regcomp/regexec failure: errcode is the value returned by regcomp/regexec, and errbuf receives the resulting message.
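A fixed-size errbuf can truncate the message. POSIX specifies that regerror() returns the size needed for the whole message, and that with errbuf_size 0 it ignores errbuf entirely, so a buffer of exactly the right size can be allocated. A minimal sketch (the helper name regex_error_message is my own):

```c
#include <regex.h>
#include <stdlib.h>

/* Return a malloc()ed error message for a regcomp()/regexec() error code,
 * or NULL if allocation fails. The caller must free() it.
 * (regex_error_message is a hypothetical helper, not part of regex.h.) */
char *regex_error_message(int errcode, const regex_t *preg)
{
    /* With a zero-sized buffer, regerror() only reports the size needed,
     * including the terminating NUL. */
    size_t need = regerror(errcode, preg, NULL, 0);
    char *msg = malloc(need);

    if (msg != NULL)
        regerror(errcode, preg, msg, need);
    return msg;
}
```

Typical use is right after a failed regcomp(): pass its return value and the same regex_t, print the message, then free() it.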

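Usage:
First compile the regular expression with regcomp(), then search for matches with regexec(), and finally don't forget to release it with regfree().
If something goes wrong, use regerror() to get the error message. Example: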
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <regex.h>

#define SUBSLEN    10
#define EBUFLEN    128    /* error buffer length */
#define BUFLEN    1024    /* matched buffer length */

int
main (int argc, char **argv)
{
    FILE *fp;
    size_t len=0;    /* store error message length */
    regex_t re;    /* store compiled regular expression */
    regmatch_t subs[SUBSLEN];    /* store matched string position */
    char matched[BUFLEN];    /* store matched strings */
    char errbuf[EBUFLEN];    /* store error message */
    int err, i;

    char string[] = "AAaba(125){3a}babAbAbCdCd123123  11(923){82}aslfk(72){4}";
    char pattern[] = "(\\([0-9]+\\))(\\{[0-9]+\\}{1})";
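/* Note: with regexes in C/C++, watch out for the '\' character. '\' is the escape character in C string literals and also in regexes, so to match a literal '\' in the input, the pattern in C source must be written "\\\\" (C string "\\\\" => regex "\\" => the character '\').
*/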

    printf ("String: %s\n", string);
    printf ("Pattern: \"%s\"\n", pattern);

    /* compile regular expression */
    err = regcomp (&re, pattern, REG_EXTENDED);

    if (err) {
        len = regerror (err, &re, errbuf, sizeof(errbuf));
        fprintf (stderr, "error: regcomp: %s\n", errbuf);
        return (1);
    }
    printf ("Number of subexpressions: %zu\n", re.re_nsub);

    int offset = 0;
    while (1) {
        /* execute pattern match; after the first match we are no longer
           at the beginning of the line, hence REG_NOTBOL */
        err = regexec (&re, string+offset, (size_t)SUBSLEN, subs, offset ? REG_NOTBOL : 0);
        if (err == REG_NOMATCH) {
            if (offset == 0)
                fprintf (stderr, "Sorry, no match ...\n");
            break;    /* no (more) matches: fall through to regfree() */
        } else if (err) {
            len = regerror (err, &re, errbuf, sizeof (errbuf));
            fprintf (stderr, "error: regexec: %s\n", errbuf);
            regfree (&re);
            return (1);
        }

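        /* neither REG_NOMATCH nor an error, so the pattern matched */
        printf ("\nOK, has matched ...\n\n");
        for (i=0; i<=(int)re.re_nsub && i<SUBSLEN; i++) {
            if (subs[i].rm_so < 0)
                continue;    /* this subexpression did not take part in the match */
            if (i==0) {
                printf ("begin: %d, end: %d, ", (int)subs[i].rm_so, (int)subs[i].rm_eo);
            } else {
                printf ("subexpression %d begin: %d, end: %d, ", i, (int)subs[i].rm_so, (int)subs[i].rm_eo);
            }
            len = (size_t)(subs[i].rm_eo - subs[i].rm_so);
            if (len >= BUFLEN)
                len = BUFLEN - 1;    /* never overflow matched[] */
            memcpy (matched, string + offset + subs[i].rm_so, len);
            matched[len] = '\0';
            printf ("match: %s\n", matched);
        }
        /* offsets are relative to string+offset, so advance past this match */
        offset += (int)subs[0].rm_eo;
    } /* while (1) */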
    regfree (&re);
    return 0;
}
//EOF

some messages echo

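Something that occurred to me yesterday while reading the LFS boot scripts.
During boot, Linux prints "[  OK  ]", "[ WARN ]" or "[ FAIL ]" at the end of each message line. The functions file of the LFS boot scripts shows how this kind of output is done: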

#!/bin/bash
# output just like linux-bootup message ".... [  OK  ]"

## Screen Dimensions
# Find current screen size
if [ -z "${COLUMNS}" ]; then
        COLUMNS=$(stty size)
        COLUMNS=${COLUMNS##* }
fi

# When using remote connections, such as a serial port, stty size returns 0
if [ "${COLUMNS}" = "0" ]; then
        COLUMNS=80
fi

## Measurements for positioning result messages
COL=$((${COLUMNS} - 8))
WCOL=$((${COL} - 2))

## Set Cursor Position Commands, used via echo -e
SET_COL="\\033[${COL}G"      # at the $COL char
SET_WCOL="\\033[${WCOL}G"    # at the $WCOL char
CURS_UP="\\033[1A\\033[0G"   # Up one line, at the 0'th char


## Set color commands, used via echo -e
# Please consult `man console_codes` for more information
# under the "ECMA-48 Set Graphics Rendition" section
#
# Warning: when switching from a 8bit to a 9bit font,
# the linux console will reinterpret the bold (1;) to
# the top 256 glyphs of the 9bit font.  This does
# not affect framebuffer consoles
NORMAL="\\033[0;39m"         # Standard console grey
SUCCESS="\\033[1;32m"        # Success is green
WARNING="\\033[1;33m"        # Warnings are yellow
FAILURE="\\033[1;31m"        # Failures are red
INFO="\\033[1;36m"           # Information is light cyan
BRACKET="\\033[1;34m"        # Brackets are blue

STRING_LENGTH="0"   # the length of the current message


echo -n -e "test test test .............. 11111"
echo -n " ||| now the fkjsal;kfa"
echo -e "${SET_COL}""${BRACKET}""[""${SUCCESS}""  OK  ""${BRACKET}""]""${NORMAL}"

echo -n -e "test est esttt esttt ------------ 2222"
echo -e "${SET_COL}""${BRACKET}""[""${FAILURE}"" FAIL ""${BRACKET}""]""${NORMAL}"

echo -n -e "test ewa tasfj a;ksljet;a ljkeakl;j"
echo -e "${SET_COL}""${BRACKET}""[""${WARNING}"" WARN ""${BRACKET}""]""${NORMAL}"
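The same escape sequences work from C. A rough sketch, assuming the same sequences as the script above: ESC[<n>G moves the cursor to column n, and ESC[<params>m sets the colour. The function name format_status and its parameters are my own invention.

```c
#include <stdio.h>

/* Format "message", jump to column col, then print a blue-bracketed label.
 * color is an SGR parameter string such as "1;32" (bold green).
 * Returns the snprintf() result, i.e. the length of the full line.
 * (format_status is a hypothetical helper.) */
int format_status(char *out, size_t size, const char *message,
                  int col, const char *color, const char *label)
{
    return snprintf(out, size,
                    "%s\033[%dG\033[1;34m[\033[%sm%s\033[1;34m]\033[0;39m\n",
                    message, col, color, label);
}
```

Writing the resulting buffer to the terminal with fputs() should give the same "message .... [  OK  ]" look as the echo -e lines in the script.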


Also, the rotating bar I once saw used as a progress indicator can be done the same way:
#!/bin/bash
# a rotating line (spinner)

if [ -z "${COLUMNS}" ]; then
        COLUMNS=$(stty size)
        COLUMNS=${COLUMNS##* }
fi
if [ "${COLUMNS}" = "0" ]; then
        COLUMNS=80
fi

COL=$((${COLUMNS} - 20))
SET_COL="\\033[${COL}G"      # at the $COL char

echo -n "asfdl;adfd;jja;:"
let n=0
while test $n -lt 5000 ; do
    let n=`expr $n+1`    # this mainly serves as a delay; otherwise `let n++` would do
    echo -en "${SET_COL}/"
    let n=`expr $n+1`
    echo -en "${SET_COL}|"
    let n=`expr $n+1`
    echo -en "${SET_COL}\\"
    let n=`expr $n+1`
    echo -en "${SET_COL}-"
done
echo -e "${SET_COL}[ OK ]"

//EOF

2007-01-20

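LFS single-user mode

Today I noticed that when LFS enters single-user mode it prompts:

Give root password for maintenance
(or type Control-D to continue)

-_- It wants a password!? That is not the single-user mode I remember.
Appending " init=/bin/bash" to the kernel line when editing the grub entry also gets you a kind of "single-user" mode, but I'd rather not go to that trouble every time.

Single-user mode is runlevel 1 [init 1], and its startup scripts live in /etc/rc.d/rc1.d/, but looking at that directory alone doesn't tell you much.
A note here on how the LFS boot scripts differ from FC6's. The biggest difference is rcsysinit: in LFS it is a directory like the other runlevels, where the 'S'/'K' prefix decides whether each script is started or stopped (one uniform mechanism), whereas in FC6 it is a single bash script, rc.sysinit (the advantage being more flexibility in choosing modes).
So that is most likely where the problem comes from. I didn't want to change much, so I added a script named 'single' to /etc/rc.d/init.d/ and linked it: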
ln -sv ../init.d/single /etc/rc.d/rc1.d/S00single

Contents of the single file:
#!/bin/bash
echo "########## Entering Single Mode ##########"
/bin/bash

Note, however, that /etc/inittab needs:
...
l1:S1:wait:/etc/rc.d/init.d/rc 1
...

//EOF

2007-01-19

"sunday"

SGA_03_17
Carson Beckett dead T_T. (At first I had only heard that Carson would be leaving us; I never expected it to happen this way.)
"universe is a big place. who knows, another day we will meet each other again."
//EOF

2007-01-16

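Advanced tar usage

I never paid much attention to tar's options before (usually just tar -xf ...). Today someone on IRC asked "how do I extract one specific file from an archive?", so I finally read man tar carefully.
(Newer versions of tar detect the compression format automatically when reading an archive, so the 'z' (*.tar.gz) and 'j' (*.tar.bz2) flags can be omitted.)
List the files in an archive ('-t'):
$ tar -tf test.tar.bz2
(The old-fashioned way: $ bunzip2 -cd test.tar.bz2 | tar -tf -)
Extract the test/test_a directory (or file) from the archive ('-x'):
$ tar -f test.tar.bz2 -jx test/test_a
(The old-fashioned way: $ bunzip2 -cd test.tar.bz2 | tar -xf - test/test_a)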

//Continue...

2007-01-14

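New Blogger template settings [2]

Adding "categories" to the new version is actually simple. Add the following below 'Blog Archive' (you can put it somewhere else too):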

<div class='widget-content'>
<h2 class='sidebar-title'>Classify</h2>
<div style='margin-left: 10px;'><ul>
<li><a href='http://dave3068.blogspot.com/search?label=Program'>Program</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=C'>C</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=GTK'>GTK</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=Glib'>Glib</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=LFS'>LFS</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=kernel'>kernel</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=xorg'>xorg</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=bloger'>bloger</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=dao'>dao</a></li>
<li><a href='http://dave3068.blogspot.com/search?label=Life'>Life</a></li>
</ul></div></div>

-_- The inconvenient part is that every time I add a new category I have to edit the template by hand again.
//EOF

tao

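道 (dao) originally means "road". Its extended meanings are broad: 1. Law or principle, as opposed to 器 (qi, concrete things) in the pair 道器, and as opposed to 德 (de, the particular laws of things) in the pair 道德. 2. The essence or substance of everything in the universe. (Laozi: "There was something formed in chaos, born before heaven and earth... it may be regarded as the mother of all under heaven. I do not know its name; I style it 道.") 3. A particular outlook on life, worldview, political position, or system of thought. (Analects, Wei Ling Gong: "Those whose ways differ cannot make plans together.") 4. Method. 5. From; by way of. 6. To govern. (Analects, Xue Er: "To govern a state of a thousand chariots, attend to affairs reverently and be trustworthy, be frugal in spending and care for the people, and employ the people at the proper seasons.") 7. Interchangeable with 导: to lead, to guide. 8. To speak, to say, as in 一语道破 ("to lay it bare in one remark"). 9. To suppose. (Jingzhong Ji, "Zhu Xin": "I only supposed it was the earth; the clear blue heaven cannot be deceived.")
(道器 is a basic pair of categories in Chinese philosophy. Book of Changes, Xi Ci I: "What is above form is called 道; what is below form is called 器." 道 is formless and carries the sense of law and norm. 器 has form and refers to concrete things, or to named objects and institutions.
Yet "without the 器 there is no 道".)
The above is excerpted from the Cihai dictionary.
//EOF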

2007-01-12

create a thread

#include <pthread.h>
int pthread_create(pthread_t *restrict thread,
              const pthread_attr_t *restrict attr,
              void *(*start_routine)(void*), void *restrict arg);

The first argument receives the thread ID, the second is the thread attributes (usually NULL), and the last two are the callback function and its argument.

Example: [$ cc thread.c -othread -lpthread]
#include <pthread.h>
#include <stdio.h>
/* Prints x's to stderr. The parameter is unused. Does not return. */
void* print_xs (void* unused)
{
    while (1)
        fputc ('x', stderr);
    return NULL;
}
/* The main program. */
int main ()
{
    pthread_t thread_id;
    /* Create a new thread. The new thread will run the print_xs
    function. */
    pthread_create (&thread_id, NULL, &print_xs, NULL);
    /* Print o's continuously to stderr. */
    while (1)
        fputc ('o', stderr);
    return 0;
}

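What do you think this example prints? In theory it should be an irregular interleaving of 'x' and 'o' characters. On my machine, though, it comes out as whole blocks of 'x...' followed by whole blocks of 'o...' (at a large enough scale that may still count as interleaved, but it is not quite what I expected). I can think of two causes: time slices and contention for a shared resource. Contention should not matter much here, since the atomic operation is only a single-character write. That leaves the time slice, which seems awfully long~~. Thinking about it, though, CPUs these days routinely run at 2 or 3 GHz; a time slice is nothing on a human scale, yet the CPU can execute a great many instructions in it. (That is the best explanation I have for now.)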

Also:
1. pthread_join
#include <pthread.h>
int pthread_join(pthread_t thread, void **value_ptr);

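Waits for thread to return. The first argument is the thread ID; the second receives the thread's return value (pass NULL if there is none or you don't need it).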
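To show the argument going in through pthread_create and the return value coming back through pthread_join, here is a minimal sketch. The names square_worker and square_in_thread are made up, and passing a small integer through the void* via intptr_t is just one common convention, not the only way.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread body: squares the integer carried in the void* argument and
 * hands the result back as the thread's return value. */
static void *square_worker(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Run square_worker in a new thread and wait for its result.
 * Returns -1 if the thread cannot be created or joined.
 * (Both function names are hypothetical.) */
long square_in_thread(int n)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, square_worker, (void *)(intptr_t)n) != 0)
        return -1;
    /* the second argument receives whatever square_worker returned */
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (long)(intptr_t)result;
}
```

For anything larger than an integer, the usual approach is to pass a pointer to a struct that outlives the thread instead.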

2. pthread_equal
#include <pthread.h>
int pthread_equal(pthread_t t1, pthread_t t2);

Compares whether two thread IDs are the same (useful for checking whether a thread is the current one), e.g.:
    if (!pthread_equal (pthread_self (), other_thread_id))
        pthread_join (other_thread_id, NULL);


3. Thread Attributes, e.g.:
pthread_attr_t attr;
pthread_t thread;
pthread_attr_init (&attr);
pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_DETACHED);
pthread_create (&thread, &attr, &thread_function, NULL);
pthread_attr_destroy (&attr);


4. Thread Cancellation
A canceled thread may later be joined; in fact, you should join a canceled thread to free up its resources, unless the thread is detached. The return value of a canceled thread is the special value given by PTHREAD_CANCELED.
A thread may be in one of three states with regard to thread cancellation.
1) The thread may be asynchronously cancelable. The thread may be canceled at any point in its execution.
2) The thread may be synchronously cancelable. The thread may be canceled, but not at just any point in its execution. Instead, cancellation requests are queued, and the thread is canceled only when it reaches specific points in its execution.
3) A thread may be uncancelable. Attempts to cancel the thread are quietly ignored.
When initially created, a thread is synchronously cancelable.
pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, NULL);
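A small sketch of the default (synchronous) case described above: the thread is canceled at a cancellation point, and joining it yields PTHREAD_CANCELED. The function names are mine; the sketch relies on sleep() being a cancellation point, which POSIX specifies.

```c
#include <pthread.h>
#include <unistd.h>

/* Sleeps forever. sleep() is a cancellation point, so a deferred
 * (synchronous) cancel request is acted on there. */
static void *sleeper(void *unused)
{
    (void)unused;
    for (;;)
        sleep(1);
    return NULL;    /* not reached */
}

/* Start a thread, cancel it, then join it.
 * Returns 1 if the join result is PTHREAD_CANCELED, 0 if not, -1 on error.
 * (cancel_and_join is a hypothetical helper.) */
int cancel_and_join(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, sleeper, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    /* joining also releases the canceled thread's resources */
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED ? 1 : 0;
}
```

If sleeper() had called pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, NULL) first, the cancel request would stay pending and the join would block.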

4.2.3 When to Use Thread Cancellation
In general, it’s a good idea not to use thread cancellation to end the execution of a thread, except in unusual circumstances. During normal operation, a better strategy is to indicate to the thread that it should exit, and then to wait for the thread to exit on its own in an orderly fashion. We’ll discuss techniques for communicating with the thread later in this chapter, and in Chapter 5, “Interprocess Communication.”
Example:
int process_transaction (int from_acct, int to_acct, float dollars)
{
    int old_cancel_state;
    /* Check the balance in FROM_ACCT. */
    if (account_balances[from_acct] < dollars)
        return 1;
    /* Begin critical section. */
    pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, &old_cancel_state);
    /* Move the money. */
    account_balances[to_acct] += dollars;
    account_balances[from_acct] -= dollars;
    /* End critical section. */
    pthread_setcancelstate (old_cancel_state, NULL);
    return 0;
}

(P.96 in <Advanced Linux Programming>)
and "Thread in C++" is P.100

Also:
pthread_mutex_t mutex;
pthread_mutex_init (&mutex, NULL);

Example (P.104):
struct job {
    /* Link field for linked list. */
    struct job* next;
    /* Other fields describing work to be done... */
};
/* A linked list of pending jobs. */
struct job* job_queue;
/* A mutex protecting job_queue. */
pthread_mutex_t job_queue_mutex = PTHREAD_MUTEX_INITIALIZER;
/* Process queued jobs until the queue is empty. */
void* thread_function (void* arg)
{
    while (1) {
        struct job* next_job;
        /* Lock the mutex on the job queue. */
        pthread_mutex_lock (&job_queue_mutex);
        /* Now it’s safe to check if the queue is empty. */
        if (job_queue == NULL)
            next_job = NULL;
        else {
            /* Get the next available job. */
            next_job = job_queue;
            /* Remove this job from the list. */
            job_queue = job_queue->next;
        }
        /* Unlock the mutex on the job queue because we’re done with the
         * queue for now. */
        pthread_mutex_unlock (&job_queue_mutex);
        /* Was the queue empty? If so, end the thread. */
        if (next_job == NULL)
            break;
        /* Carry out the work. */
        process_job (next_job);
        /* Clean up. */
        free (next_job);
    }
    return NULL;
}
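The job queue above needs process_job and a populated list before it can run, so here is a smaller self-contained sketch of the same lock / touch shared data / unlock pattern. All names (counter, incrementer, run_counter_threads) and the thread and iteration counts are my own choices for illustration.

```c
#include <pthread.h>

#define NTHREADS 4
#define NINCR    10000

static long counter;    /* shared data protected by counter_mutex */
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds NINCR to the shared counter, one step at a time. */
static void *incrementer(void *unused)
{
    (void)unused;
    for (int i = 0; i < NINCR; i++) {
        pthread_mutex_lock(&counter_mutex);
        counter++;    /* critical section */
        pthread_mutex_unlock(&counter_mutex);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
long run_counter_threads(void)
{
    pthread_t tids[NTHREADS];
    int started = 0;

    counter = 0;
    for (; started < NTHREADS; started++)
        if (pthread_create(&tids[started], NULL, incrementer, NULL) != 0)
            break;
    /* join only the threads that were actually started */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == NTHREADS ? counter : -1;
}
```

Without the lock/unlock pair the increments from different threads can overwrite each other and the total may come up short.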


//EOF

2007-01-08

Gtk.TextBuffer undo

// This is the key to handling TextBuffer modification: connecting these signals lets us respond to changes in the text
static gint undo_connect_signal(GtkTextBuffer *buffer)
{
    g_signal_connect(G_OBJECT(buffer), "delete-range",
        G_CALLBACK(cb_delete_range), buffer);
    g_signal_connect_after(G_OBJECT(buffer), "insert-text",
        G_CALLBACK(cb_insert_text), buffer);
    return
    g_signal_connect(G_OBJECT(buffer), "modified-changed",
        G_CALLBACK(cb_modified_changed), NULL);
}

// Also:
static void undo_check_step_modif(GtkTextBuffer *buffer)
{
    if (g_list_length(undo_list) == step_modif) {
        g_signal_handlers_block_by_func(G_OBJECT(buffer), G_CALLBACK(cb_modified_changed), NULL);
        gtk_text_buffer_set_modified(buffer, FALSE);
        g_signal_handlers_unblock_by_func(G_OBJECT(buffer), G_CALLBACK(cb_modified_changed), NULL);
        set_main_window_title_with_asterisk(FALSE);
    }
}

continue...

time

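Time flies. Before I knew it another year had gone by, and another week too.
As a child I kept wishing time would pass faster; now I feel it passes too fast. I wonder what I'll think when I'm old? By then I will surely wish time could run backwards.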

2007-01-07

kernel video option

1. Framebuffer:
Use vesafb and none of the other xxfb drivers. Keep only vesafb, and it must be built in, not a module (m).

2.xorg rendering:

Processor type and features --->
<*> MTRR (Memory Type Range Register) support
Character devices --->
<M> /dev/agpgart (AGP Support)
[M] SIS... support
(Enable your chipset instead of the above.)
< > Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)
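**Note: on my machine this still has to be "M" before xorg reports "rendering enabled"
(Advice online says not to select it; I suspect that applies when using ATI's official driver?)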
(Make sure the Direct Rendering Manager (DRM) is off. The X11-DRM package will provide its own.)

Also, among the xf86-video drivers for xorg 7.1, chips has to be built in addition to ati. (I built rendition as well.)
//EOF

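Argh: xdm "Login incorrect"

I recently rebuilt xorg in my LFS system (it was 6.9, now it is 7.1).
Afterwards I found that xdm would no longer let me log in!!?? It says Login incorrect... and I'm sure my password is right...
Logging in on the console works. After init 3 I rebuilt xdm (adding --with-pam) and got "incorrect" again.
er... I'm about to lose it @@
It is probably a PAM or shadow problem. I went into /etc and suddenly noticed there was no PAM configuration file and no pam.d directory...
Frustrating~ did I forget to set up the PAM configuration when I first built LFS??
That doesn't make sense~ I used this LFS for so long without trouble, and the 6.9 xdm could log in fine. Argh.
So I redid the PAM configuration, and now xdm logs in normally.
//EOF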

2007-01-04

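After several days offline, things are improving!

Ever since 2006-12-27, when the Taiwan earthquake severed the undersea cables and cut off access to overseas networks, today is the first day things finally look better.
I was surprised to find that Blogger is reachable again, and not even slow :D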

2007-01-02

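Reading a text file with Glib

Three days into gnotepad I ran into some difficulties and got stuck (mainly implementing undo), so I decided to download the source of something simpler, XFCE's mousepad, and have a look (the benefit of open source :)).
After great effort, at a top speed of 1 KB/s, I finally got the mousepad source downloaded. Since the Taiwan earthquake cut the links, reaching overseas sites is really hard; just opening the page and finding mousepad took me over an hour -_-|||

Ha, after downloading it I was surprised to find that the source layout of my gnotepad and mousepad is remarkably similar :D. No wonder: both are modeled on the Windows notepad and built on GTK+...
Reading it, I discovered that Glib has g_file_get_contents() for getting a file's contents, which is much better than the g_fopen + fread I had been using (I had been wondering why Glib has g_fopen but no functions for reading and writing files).
gboolean g_file_get_contents (const gchar *filename, gchar **contents, gsize *length, GError **error);
gboolean g_file_set_contents (const gchar *filename, const gchar *contents, gssize length, GError **error);
[Since v2.8]
For writing files, though, it still uses fopen + fputs, much like mine.
The reason for not using g_file_set_contents() is that length is awkward to determine (especially when multi-byte characters and ASCII are mixed; the fputs approach is simpler).
There is also the encoding-conversion part I wanted, g_convert() (though in the end I decided not to do encoding conversion).
Heh, I also found the code for the modify handling that I need, as well as the undo and redo code; it still needs some study.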
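On the length concern with g_file_set_contents() mentioned above: as far as I can tell from the GLib documentation, length is a count of bytes, not characters, and -1 may be passed when contents is NUL-terminated. For a NUL-terminated buffer the byte count is simply strlen(), whatever mix of ASCII and multi-byte characters it holds. The plain-C sketch below (no GLib needed; utf8_char_count is a made-up helper and assumes valid UTF-8) shows the difference between the two counts.

```c
#include <string.h>

/* Number of bytes in a NUL-terminated string: this is the kind of value
 * a byte-length parameter expects. */
size_t byte_length(const char *text)
{
    return strlen(text);
}

/* Number of UTF-8 characters: count every byte that is not a
 * continuation byte (10xxxxxx). Assumes text is valid UTF-8.
 * (Both helper names are hypothetical.) */
size_t utf8_char_count(const char *text)
{
    size_t n = 0;

    for (; *text != '\0'; text++)
        if (((unsigned char)*text & 0xC0) != 0x80)
            n++;
    return n;
}
```

For the string "a" + one three-byte CJK character + "b", byte_length() gives 5 while utf8_char_count() gives 3; only the first is what a write call needs.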
//EOF