[Linux Repository] Thread Control

November 21, 2025 · Middleware · SE_Wang

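The POSIX thread library
The thread-related functions form a complete family; the vast majority of their names begin with "pthread_".
To use these functions, include the header <pthread.h>.
When linking against the thread library, pass the "-lpthread" option to the compiler.
Creating a thread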
int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr, void *(*start_routine)(void *), void *restrict arg);

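Purpose: create a new thread.

Parameters:

thread: returns the thread ID (an output parameter). Is this ID the LWP? No. The LWP does not need to be exposed to the user; what the ID actually is will be revealed below.
attr: sets the thread's attributes (priority, stack size, and so on); passing NULL for attr selects the default attributes.
start_routine: a callback (a function pointer), the function the thread runs once it starts.
arg: the argument passed to the thread's start function.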
RETURN VALUE
On success, pthread_create() returns 0; on error, it returns an error number, and the contents of *thread are undefined.

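Return value: 0 on success; an error code on failure.

Error checking:

Many traditional functions return 0 on success and -1 on failure, and set the global variable errno to indicate the error.
pthreads functions do not set errno when they fail (although most other POSIX functions do). Instead, they return the error code directly as the return value.
pthreads still provides a per-thread errno so that other code relying on errno keeps working. For errors from pthreads functions themselves, check the return value: that is where the error is reported, and reading it is also cheaper than reading the per-thread errno.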
An example:

#include <pthread.h>
#include <unistd.h>
#include <cstdio>
#include <string>

// New thread
void *thread_routine(void *args)
{
    std::string threadname = static_cast<const char *>(args);
    while (true)
    {
        printf("new thread...\n");
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, thread_routine, (void *)"thread-1");
    (void)n; // silence the unused-variable warning

    // Main thread
    sleep(2);
    while (true)
    {
        printf("main thread...\n");
        sleep(1);
    }
    return 0;
}

Running the program shows two flows of execution, which confirms that a thread really was created.

We said the ID is not the LWP. How can we verify that?

pthread_t id;
int n = pthread_create(&id, nullptr, thread_routine, (void *)"thread-1");
(void)n;

// Print in hex rather than with %ld, because the value is very large
printf("main create a new thread, new thread id is 0x%lx\n", (unsigned long)id);

ps -aL | head -1 && ps -aL | grep testThread

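How should this "ID" be understood? It is a process-wide unique identifier that the pthread library defines for each thread, and the pthread library maintains it.

Because every process has its own independent address space, the scope of this "ID" is the process, not the whole system (the kernel does not recognize it).

The pthread library itself creates threads through system calls provided by the kernel (clone, for example), and the kernel assigns every thread a system-wide unique "ID" to identify it.

What is the LWP?

The LWP is the real, kernel-level thread ID. The number printed earlier (the same value pthread_self returns) is actually an address in the virtual address space. Through that address the library can find the thread's basic information, including its thread ID, its stack, and its saved registers.
Among the thread IDs listed by ps -aL, one is equal to the process ID: that is the main thread. The main thread's stack is the stack region of the virtual address space, while the stacks of the other threads live in the shared (memory-mapped) region between the heap and the stack. The pthread functions are provided by the pthread library, which is itself mapped into that region and allocates each new thread's stack there, so every thread other than the main thread has its stack in the shared region.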
Global variables are shared
#include <pthread.h>
#include <unistd.h>
#include <cstdio>
#include <string>

int gval = 100;

// New thread
void *thread_routine(void *args)
{
    std::string threadname = static_cast<const char *>(args);

    while (true)
    {
        printf("new thread...: gval: %d, &gval %p \n", gval, (void *)&gval);
        gval++;
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, thread_routine, (void *)"thread-1");
    (void)n; // silence the unused-variable warning

    // Main thread
    sleep(2);
    while (true)
    {
        printf("main thread... : gval: %d, &gval %p \n", gval, (void *)&gval);
        sleep(1);
    }
    return 0;
}

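In the chapter on processes we ran a similar experiment. There, parent and child printed the same virtual address but different values.

With threads, both print the same virtual address, and a change made to the global variable by one thread is seen by the other.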

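Thread-local storage
thread_local int gval = 100;

Each thread maintains its own copy of the variable.

Across threads, both the value and the address of the variable are independent.

Threads do not share thread_local variables.

Even when one global declaration is marked thread_local, each thread that accesses it gets a different virtual address.

How is this achieved?

thread_local is a keyword introduced in C++11 and is handled under the hood by the compiler (GCC / G++, for example).

The compiler places the variable in the thread-local storage area, which guarantees every thread its own copy.

Unlike an ordinary static or global variable, a thread_local variable is not shared by all threads.

In the ELF symbol table, thread_local variables carry a special marker (the TLS symbol type) that distinguishes how they are stored.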

Functions are shared
#include <pthread.h>
#include <unistd.h>
#include <cstdio>
#include <string>

int gval = 100;

std::string fun()
{
    return "I am another function";
}

// New thread
void *thread_routine(void *args)
{
    std::string threadname = static_cast<const char *>(args);
    while (true)
    {
        printf("new thread...: gval: %d, &gval %p, %s \n", gval, (void *)&gval, fun().c_str());
        gval++;
        sleep(1);
    }
    return nullptr; // a thread exits when its entry function returns
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, thread_routine, (void *)"thread-1");
    (void)n; // silence the unused-variable warning

    // Main thread
    sleep(2);
    while (true)
    {
        printf("main thread... : gval: %d, &gval %p, %s \n", gval, (void *)&gval, fun().c_str());
        sleep(1);
    }
    return 0;
}

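Here too, both threads can be seen executing the same function at the same time. This is exactly what it means for a function to be re-entered.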

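The heap is shared (in principle)
Allocating heap memory at global scope:

int *data = new int(10);

Allocating inside the new thread:

When the new thread's function allocates heap memory, the pointer variable data lives on that thread's stack, so only that thread can reach the local variable and therefore knows the starting virtual address of the block. But all that is needed is to let the other threads learn that starting address, which is not hard to do. That is why we say the heap is shared in principle.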

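One thread crashes and the whole process exits
int cnt = 5; // assumed starting value; the original snippet does not show the declaration

// New thread
void *thread_routine(void *args)
{
    std::string threadname = static_cast<const char *>(args);
    while (true)
    {
        cnt--;
        if (cnt == 3)
        {
            // Dereference a null pointer
            printf("%s is dead...\n", threadname.c_str());
            int *p = nullptr;
            *p = 0;
        }
        sleep(1);
    }
    return nullptr; // a thread exits when its entry function returns
}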

Waiting for threads
Motivation: the main thread exits first and the process is reclaimed
#include <pthread.h>
#include <unistd.h>
#include <cstdio>
#include <string>

// New thread
void *thread_routine(void *args)
{
    std::string threadname = static_cast<const char *>(args);
    while (true)
    {
        printf("new thread...\n");
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, thread_routine, (void *)"thread-1");
    (void)n; // silence the unused-variable warning

    // Main thread
    sleep(2);
    while (true)
    {
        printf("main thread...\n");
        sleep(1);
        break;
    }
    return 0;
}

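Why is the whole process reclaimed when the main thread exits?

When main returns, the main thread ends, and so does the process: its resources are released and every thread exits, even those that have not finished.

To keep the main thread from exiting first, in multithreaded code the main thread has to wait for the other threads, much like wait for processes. Waiting also prevents the new thread's resources from leaking (a problem similar to zombie processes).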

int pthread_join(pthread_t thread, void **retval);
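Purpose: wait for a thread to finish.

Parameters:

thread: the thread to wait for.
retval: an output parameter that receives the void* returned by the new thread's function.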
RETURN VALUE
On success, pthread_join() returns 0; on error, it returns an error number.

Return value: 0 on success; an error code on failure.

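#include <pthread.h>
#include <unistd.h>
#include <cstdio>
#include <string>
#include <vector>

// Every thread is supposed to have its own name
std::vector<pthread_t> tids;

void *thread_routine(void *args)
{
    std::string name = static_cast<const char *>(args);
    printf("new thread is running , name is : %s\n", name.c_str());
    sleep(1);
    return nullptr; // a thread exits when its entry function returns
}

int main()
{
    // Create 10 threads
    for (int i = 0; i < 10; i++)
    {
        pthread_t tid;
        char idbuffer[64]; // problem (kept on purpose): every iteration reuses this stack buffer
        snprintf(idbuffer, 64, "thread-%d", i + 1);
        int n = pthread_create(&tid, nullptr, thread_routine, idbuffer);
        (void)n;
        tids.push_back(tid);
    }

    sleep(1);
    for (auto &tid : tids)
        printf("main create a new thread, new thread id is : 0x%lx\n", (unsigned long)tid);

    // In multithreaded code, which thread should exit last? The main thread!
    // Threads have to be waited for, just like wait for processes;
    // otherwise a problem similar to zombie processes appears.
    // Wait for and reclaim all the threads
    for (auto &tid : tids)
    {
        pthread_join(tid, nullptr);
        printf("thread end..., the thread that exited is: 0x%lx\n", (unsigned long)tid);
    }

    printf("wait new thread success...\n");
    return 0;
}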

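In the program above, the main thread creates 10 threads and at the end blocks to reclaim them so that no thread resources leak. We also want each created thread to print its own name, which is why the idbuffer array is defined.

Even though a thread does not necessarily run the moment it is created, we would still expect to see thread-1 through thread-10 each appear once. Why does that not happen? Let us look closely at the cause.

In a multithreaded program every thread should have its own name. Yet in the program's output, where the names thread-1 through thread-9 should appear, several thread-10 lines appear instead.

The cause is that inside the new thread's function the name is copied from the args parameter, but for some threads the memory behind args has already been modified before that copy runs. Specifically, args points to idbuffer, and idbuffer is reused: the name of a thread created later (thread-10, for instance) is written into it.

The result:

For threads 1 to 9, idbuffer may already have been overwritten with thread-10's name before the thread actually executes name = args;

so those threads read thread-10's contents when they copy the name;

and the output ends up containing several "thread-10" lines.

How can this be fixed? Give every thread its own separately allocated buffer.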

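for (int i = 0; i < 10; i++)
{
    pthread_t tid;
    char *name = new char[64]; // one heap buffer per thread
    snprintf(name, 64, "thread-%d", i + 1);
    int n = pthread_create(&tid, nullptr, thread_routine, name);
    (void)n;
    tids.push_back(tid);
}
// Note: thread_routine should delete[] the buffer once it has copied the name, otherwise it leaks.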

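Global variables are shared between threads!
Other functions are shared between threads!
In principle, the heap is shared as well!

The thread-crash problem:

Reduced robustness: multithreaded code demands more thorough and deeper consideration. In a multithreaded program there is a real chance that a slight difference in timing, or a variable that should not have been shared, causes harm. In other words, threads are not protected from one another.
If a single thread crashes because of a division by zero or a wild pointer, the process crashes with it.
A thread is an execution branch of its process. An exception in a thread is effectively an exception in the process: it triggers the signal mechanism, which terminates the process, and once the process terminates all of its threads exit too.
If any thread calls exit, the process exits, which in turn makes every thread exit.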
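Terminating a thread
void pthread_exit(void *retval);

Purpose: terminate the calling thread.

Parameter: retval must not point to a local variable.

Return value: none. As with a process, a thread that ends cannot return to its caller (itself).

Note that the memory pointed to by the pointer passed to pthread_exit or returned with return must be global or allocated with malloc. It must not be on the thread function's stack, because by the time another thread receives the pointer the thread function has already exited.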

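There are three ways to terminate a single thread without terminating the whole process:

pthread_exit(nullptr)
return nullptr (from the thread's entry function)
One thread can call pthread_cancel to terminate another thread in the same process.
In the section on waiting we said that the second parameter of pthread_join retrieves the thread's exit information. The point is to let the main thread inspect the returned value and decide what to do next.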

// New thread
pthread_exit((void*)0);

// Main thread
void *ret = nullptr;
int m = pthread_join(tid, &ret);

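What is a thread's exit information?

Recall from the chapter on processes that exit information = signal + exit code. Here we can get the exit code, but how do we learn the signal? In fact there is no need to think about signals. With threads there is no opportunity to handle such an exception, because once one occurs the whole process exits and pthread_join is never reached.

The thread above returns the number 0, but who says the returned information has to be a built-in type? Can a thread return a class object? A function? (At the application level the argument can likewise be of any type, class objects included. Make sure this point sinks in!)

Returning a class object: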

#include <pthread.h>
#include <unistd.h>
#include <iostream>

// Returning a class object
class Task
{
public:
    Task() : _x(0), _y(0), _result(0), _code(0)
    {
    }
    Task(int x, int y) : _x(x), _y(y), _result(0), _code(0)
    {
    }
    void Div()
    {
        if (_y == 0)
        {
            _code = 1; // division by zero
            return;
        }
        _result = _x / _y;
    }
    void Print()
    {
        std::cout << "result: " << _result << "[" << _code << "]" << std::endl;
    }
private:
    int _x;
    int _y;
    int _result;
    int _code;
};

class Result
{
};

void *start_routine(void *args)
{
    // args points to a Task here, so it must not be read as a C string
    Task *t = static_cast<Task *>(args);
    while (true)
    {
        std::cout << "I am a new thread, " << std::endl;
        sleep(1);
        break;
    }
    t->Div();
    sleep(2);
    return (void *)t; // return a class object from the thread
}
int main()
{
    pthread_t tid; // in the pthread library this is the start address of the thread control block
    // Pass an argument to the thread
    Task *t = new Task(20, 10);
    pthread_create(&tid, nullptr, start_routine, (void *)t);
    sleep(2);
    while (true)
    {
        sleep(1);
        std::cout << "I am the main thread, " << std::endl;
        break;
    }
    void *ret = nullptr;
    int n = pthread_join(tid, &ret);
    (void)n;
    Task *result = (Task *)ret;
    result->Print();
    delete result; // same object as t; free it once
    return 0;
}

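Returning a function:

#include <pthread.h>
#include <cstdio>
#include <string>

// Define a function-pointer type
typedef int (*MathFunction)(int, int);

int add(int a, int b)
{
    return a + b;
}

// Thread function that returns a function pointer
void *thread_function(void *args)
{
    std::string name = static_cast<const char *>(args);
    printf("new thread is created , name is %s\n", name.c_str());
    MathFunction result = &add;

    // Converting a function pointer to void* works on POSIX systems
    return reinterpret_cast<void *>(result); // return the function pointer
}

int main()
{
    pthread_t tid;

    // Create the new thread
    int n = pthread_create(&tid, nullptr, thread_function, (void *)"thread-1");
    (void)n;

    // Wait for the thread to finish and fetch the returned function pointer
    void *raw = nullptr;
    pthread_join(tid, &raw);
    MathFunction ret = reinterpret_cast<MathFunction>(raw);
    printf("new thread wait success...\n");
    // Use the returned function to compute
    int a = 10, b = 5;
    int result = ret(a, b);
    printf("result: %d\n", result);

    return 0;
}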

Cancelling a thread
int pthread_cancel(pthread_t thread);
Purpose: cancel a running thread.
Parameter: thread: the thread ID.

RETURN VALUE
On success, pthread_cancel() returns 0; on error, it returns a nonzero error number.
Return value: 0 on success; an error code on failure.

The main thread cancels a thread:

#include <pthread.h>
#include <unistd.h>
#include <cstdio>
#include <string>

// Thread cancellation
void *thread_routine(void *args)
{
    std::string name = static_cast<const char *>(args);
    while (true)
    {
        printf("new thread name : %s\n", name.c_str());
        sleep(1);
    }
    return nullptr;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, thread_routine, (void *)"thread-1");
    sleep(5);
    // Cancel the thread
    pthread_cancel(tid);
    printf("new thread was cancelled...\n");
    return 0;
}

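The new thread never reaches return nullptr because the main thread cancels it first. So what does the second parameter of pthread_join bring back in that case?

When a thread is cancelled by another thread, its return value is -1: #define PTHREAD_CANCELED ((void *)-1). The wait itself succeeds, but the system sets the returned information to -1.

If the thread terminated by calling pthread_exit itself, the location retval points to holds the argument that was passed to pthread_exit.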

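Getting the calling thread's ID
pthread_t pthread_self(void);
Purpose: returns a value of type pthread_t that is the "ID" of the thread calling pthread_self.

Since the main thread can cancel other threads, can a thread cancel itself? It can, but there is little point in doing so.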

pthread_cancel(pthread_self());

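Detaching a thread
By default a newly created thread is joinable: after it exits, pthread_join must be called on it, otherwise its resources are not released and leak.
If the thread's return value is of no interest, join is a burden. In that case we can tell the system to release the thread's resources automatically when the thread exits.
int pthread_detach(pthread_t thread);

The main thread can detach a thread, and a thread can also detach itself or another thread.

Use case:

Real software is typically an endless loop: the main thread never exits, so the other threads should be released automatically.

The multiple flows of execution all exist to carry out the process's task.

However detached a thread is, it is still a thread of the process and still shares the same virtual address space and so on.

Whether a thread is joinable or detached comes down to a flag recorded in the thread's control block, which the thread library maintains; the two states are simply different values of that flag.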

Getting the LWP through a system call number
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <iostream>
#include <string>

#define get_lwp_id() syscall(SYS_gettid)

__thread int lwpid;

// Use the system call number to get this thread's own LWP id
void *thread_routine(void *args)
{
    lwpid = get_lwp_id();
    std::string name = static_cast<const char *>(args);
    while (true)
    {
        std::cout << "new thread create... " << "new thread id is: " << lwpid << std::endl;
        sleep(1);
        break;
    }
    return (void *)10;
}
int main()
{
    lwpid = get_lwp_id();
    pthread_t tid;
    pthread_create(&tid, nullptr, thread_routine, (void *)"thread-1");

    void *ret;
    int n = pthread_join(tid, &ret);
    (void)n;
    std::cout << "join success " << "main thread id is: " << lwpid << std::endl;
    sleep(10);
    return 0;
}

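The thread library and the second parameter of pthread_join

C++11 thread vs pthread
On Linux, C++ multithreading is essentially a wrapper around the pthread library!

macOS, other Unix systems, Windows, and so on each have their own thread library implementation. Does C++ have to wrap each of them as well?

It does. When a piece of software is created, what is its first priority? To be used, to keep being used, and to widen its user base.

Why does C++ support multithreading, and how can it make sure its threading facilities are used by as many people as possible?

By being cross-platform!

Why cross-platform? Behind every platform is a group of users. The battle over portability is, at bottom, a battle for users!

How is portability achieved?

By writing a separate wrapper for each platform's threading facilities!

This is why it takes a language so long, and so much effort, to support a new feature.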

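Thread IDs and the process address-space layout
pthread_create produces a thread ID and stores it at the address given by its first parameter. This thread ID is not the same thing as the thread ID (LWP) discussed earlier.
The LWP belongs to process scheduling. Because a thread is a lightweight process, the smallest unit the operating system scheduler works with, it needs a number that identifies it uniquely.
The first parameter of pthread_create points to a location in virtual memory, and the value stored there is the new thread's ID, which belongs to the NPTL thread library. Subsequent thread-library operations use this ID to operate on the thread.
What type is pthread_t exactly? That depends on the implementation. In the NPTL implementation currently used on Linux, a pthread_t thread ID is essentially an address in the process's address space.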

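Summary
This article covered the core of Linux thread programming, focusing on how to use the POSIX thread library (pthread). The main topics were thread creation and termination (pthread_create / pthread_exit), waiting for threads and reclaiming their resources (pthread_join), thread cancellation (pthread_cancel), and thread detachment (pthread_detach). Code examples demonstrated that threads share global variables, functions, and the heap; the article also explained how thread-local storage (thread_local) works and contrasted the thread ID with the LWP. Finally, it discussed the relationship between C++ multithreading and the pthread library, pointing out that the C++11 thread library is essentially a wrapper over each platform's native thread library.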

————————————————
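Copyright notice: this is an original article by CSDN blogger "egoist2023", licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/egoist2023/article/details/154029264

Author: SE_Wang

Link: https://www.cnesa.cn/8894.html

Copyright belongs to the author. Do not repost without permission.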