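This is an interview question I once ran into and could not answer at the time. Even now, as I write this post, I cannot claim that what I have gathered is the answer the interviewer was looking for. I simply understand a little more than I did back then, and I am sharing it for reference.
I no longer remember the exact wording, but the question was roughly: in which situations is memcpy inefficient?
Function prototypes and descriptions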
    #include <string.h>

    void *memcpy(void *restrict dst, const void *restrict src, size_t n);

    DESCRIPTION
        The memcpy() function copies n bytes from memory area src to memory
        area dst. If dst and src overlap, behavior is undefined. Applications
        in which dst and src might overlap should use memmove(3) instead.
If the source and destination regions might overlap, use memmove instead:
    #include <string.h>

    void *memmove(void *dst, const void *src, size_t len);

    DESCRIPTION
        The memmove() function copies len bytes from string src to string dst.
        The two strings may overlap; the copy is always done in a
        non-destructive manner.
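To make the overlap rule concrete, here is a minimal sketch. The helper name `shift_right` is hypothetical and only serves as an illustration; it moves a block to a higher address inside the same buffer, which is exactly the case where memcpy is undefined and memmove is required.

```c
#include <string.h>

/* Hypothetical helper: move the first `count` bytes of `buf` to the
   position `shift` bytes further right. Source and destination overlap
   whenever shift < count, so memmove is required here; calling memcpy
   with these arguments would be undefined behavior.
   The caller must guarantee that buf holds at least count + shift bytes. */
void shift_right(char *buf, size_t count, size_t shift)
{
    memmove(buf + shift, buf, count);
}
```

For example, with `char buf[16] = "abcdef";`, calling `shift_right(buf, 6, 2)` leaves the first two bytes untouched and copies the original six bytes to offset 2.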
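For reference, here is glibc's generic C implementation of memcpy (on common architectures glibc normally substitutes hand-tuned assembly versions for it):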
    void *memcpy (void *dstpp, const void *srcpp, size_t len)
    {
      unsigned long int dstp = (long int) dstpp;
      unsigned long int srcp = (long int) srcpp;

      /* Copy from the beginning to the end. */

      /* If there not too few bytes to copy, use word copy. */
      if (len >= OP_T_THRES)   /* OP_T_THRES is 16 */
        {
          /* Copy just a few bytes to make DSTP aligned. */
          len -= (-dstp) % OPSIZ;
          BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);

          /* Copy whole pages from SRCP to DSTP by virtual address
             manipulation, as much as possible. */
          PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);

          /* Copy from SRCP to DSTP taking advantage of the known alignment
             of DSTP. Number of bytes remaining is put in the third argument,
             i.e. in LEN. This number may vary from machine to machine. */
          WORD_COPY_FWD (dstp, srcp, len, len);

          /* Fall out and copy the tail. */
        }

      /* There are just a few bytes to copy. Use byte memory operations. */
      BYTE_COPY_FWD (dstp, srcp, len);

      return dstpp;
    }

    /* Copy exactly NBYTES bytes from SRC_BP to DST_BP, without any
       assumptions about alignment of the pointers. */
    #define BYTE_COPY_FWD(dst_bp, src_bp, nbytes)        \
      do                                                 \
        {                                                \
          size_t __nbytes = (nbytes);                    \
          while (__nbytes > 0)                           \
            {                                            \
              byte __x = ((byte *) src_bp)[0];           \
              src_bp += 1;                               \
              __nbytes -= 1;                             \
              ((byte *) dst_bp)[0] = __x;                \
              dst_bp += 1;                               \
            }                                            \
        } while (0)
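The code shows that for small copies, meaning fewer than 16 bytes (OP_T_THRES, the size of four 4-byte integers), memcpy goes straight to BYTE_COPY_FWD: a plain loop that assigns one byte at a time from the source to the destination, with no regard for alignment or overlap. For larger copies it first copies a few bytes to align the destination, then copies by page and by word where possible, and finally copies the remaining tail byte by byte. This is what I infer from the names and comments of PAGE_COPY_FWD_MAYBE and WORD_COPY_FWD, which are macros rather than functions. Note that this generic C code moves one machine word per step and does not itself use vector instructions; the SIMD-optimized copies live in the architecture-specific implementations that glibc usually selects instead.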
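The align-then-copy-words strategy can be sketched in plain C as follows. This is a simplified illustration, not glibc's code: `sketch_copy` and `word_t` are hypothetical names, the `__may_alias__` attribute assumes GCC or Clang, the page-copy step is omitted, and a misaligned source simply falls back to the byte loop (glibc instead merges shifted words in that case).

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension (an assumption of this sketch): word_t may alias any
   object type, so word-sized accesses do not break strict aliasing. */
typedef unsigned long __attribute__((__may_alias__)) word_t;

/* Simplified forward copy; like memcpy, it does not support overlap. */
void *sketch_copy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (len >= 16) {
        /* Copy a few bytes until the destination is word-aligned. */
        size_t head = (size_t)(-(uintptr_t)d % sizeof(word_t));
        len -= head;
        while (head--)
            *d++ = *s++;

        /* Copy whole words, but only if the source is aligned as well. */
        if ((uintptr_t)s % sizeof(word_t) == 0) {
            while (len >= sizeof(word_t)) {
                *(word_t *)d = *(const word_t *)s;
                d += sizeof(word_t);
                s += sizeof(word_t);
                len -= sizeof(word_t);
            }
        }
    }

    /* Copy the tail (or everything, for small or misaligned copies). */
    while (len--)
        *d++ = *s++;

    return dst;
}
```

The threshold of 16 mirrors OP_T_THRES from the listing above; below it, the alignment bookkeeping would likely cost more than it saves.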
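For copies shorter than 16 bytes, then, the whole operation reduces to the BYTE_COPY_FWD macro. Because the preprocessor expands the macro in place, the byte loop costs no extra function call inside memcpy (no argument passing, no call and return), so this path is cheap. The call to memcpy itself still has to be paid for, unless the compiler inlines it.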
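Now suppose the data has a fixed, known size, such as a 4-byte integer. memcpy still works, but the routine above would run its byte loop four times where a single 4-byte assignment would do. In addition, memcpy reproduces the bytes exactly as they sit in memory, so when those bytes are exchanged between platforms with different endianness (through a file or a network packet, for example), the value may be read back incorrectly.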
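One caveat on the fixed-size case: the byte loop only runs when the library routine is actually called. GCC and Clang treat memcpy as a built-in, and with optimization enabled they typically replace a call whose size is a small compile-time constant with a single load or store. Whether that happens depends on the compiler and flags, so inspect the generated assembly before relying on it. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned
   buffer, in the host's native byte order. */
uint32_t load_u32_native(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* size is a compile-time constant (4) */
    return v;
}

/* Hypothetical helper: write a 32-bit value in native byte order. */
void store_u32_native(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Written this way, the copy is also safe for unaligned addresses, which a direct `*(uint32_t *)p` access does not guarantee.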
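So the conclusion is that memcpy is a poor fit in the following scenarios:
- The source and destination regions overlap. Unsafe: the behavior is undefined.
- Copying small, fixed-size data. Inefficient.
- Copying small data in cross-platform code where byte order may differ. Unsafe.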
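For the cross-platform case, a common alternative to copying raw bytes is to fix the byte order explicitly with shifts. The sketch below uses big-endian order; `store_be32` and `load_be32` are hypothetical names chosen for this illustration, not functions from any library.

```c
#include <stdint.h>

/* Hypothetical helper: write v as 4 bytes, most significant byte first,
   regardless of the host's endianness. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Hypothetical helper: read back a value written by store_be32. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Because the shifts operate on the value rather than on its in-memory layout, both functions produce the same byte sequence on little-endian and big-endian hosts.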
A month later, when ChatGPT took off, I typed this question into ChatGPT and got the "standard answer" to it. Things became a lot simpler.