C/C++ & Java: Comparing String Concatenation Performance
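C/C++ offers many ways to concatenate strings, and "VC++" adds a few more. (VC++ should not really be called a language, but many programmers who use it have effectively turned it into one: C++ in the Microsoft style.) On Windows, many Visual C++ programmers habitually use CString from Microsoft's MFC. CString is certainly convenient when the rest of the code is built on MFC, but for projects that do not use MFC I recommend the Standard C++ strings instead. They have several advantages. First, they are standard: if you handle strings with std::string and std::wstring, and your program avoids operating-system-specific APIs (or replaces them with a cross-platform library), the code should compile directly with any standard-conforming C++ compiler on the target platform, so the program ports well. They also perform well. Below I compare the performance of several concatenation methods in VC++ 2008. Some of the methods preallocate memory. The test program sorts all results by elapsed time at the end, so when reading the ranking you need to take into account whether a method preallocates. std::string and std::wstring also have mechanisms of their own that improve efficiency when nothing is preallocated. I then ran a similar test in Java, where StringBuffer and StringBuilder also concatenate efficiently. The results should help us choose the most efficient way to concatenate strings.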
The concatenation methods tested are:
1. The C function strcat on preallocated memory
2. The C function memcpy on preallocated memory
3. std::string::append in C++
4. std::string operator+= in C++
5. std::ostringstream operator<< in C++
6. std::string with preallocation (reserve called beforehand)
7. CString::Append in MFC
8. CString operator+= in MFC
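To make the difference between items 4 and 6 concrete, here is a minimal sketch that counts how often a std::string changes its capacity while pieces are appended, with and without a call to reserve. The helper name count_reallocations is hypothetical and is not part of the original test program; the exact count without reserve depends on the growth policy of the standard library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `piece` to a std::string `times` times and counts how often the
// capacity changes, i.e. how often the string had to grow its buffer.
// If `preallocate` is true, reserve() is called once before the loop.
// (count_reallocations is a hypothetical helper for illustration only.)
std::size_t count_reallocations(const std::string& piece,
                                std::size_t times,
                                bool preallocate)
{
    std::string s;
    if (preallocate) {
        s.reserve(piece.size() * times);
    }

    std::size_t reallocations = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < times; ++i) {
        s += piece;
        if (s.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = s.capacity();
        }
    }

    assert(s.size() == piece.size() * times);
    return reallocations;
}
```

Calling count_reallocations("cppmule", 90000, true) should report no capacity change inside the loop, while passing false reports a small number of growth steps rather than one per append. That geometric growth is presumably the efficiency mechanism that keeps plain operator+= reasonably fast even without preallocation.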
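The test code follows.
Note: with const int TEST_TIMES = 9000000; the C memcpy version takes 16-32 ms and std::string takes about 1000 ms. The other methods take far too long at that count, so TEST_TIMES was reduced to 90000. The same applies to plain String in Java.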
- C/C++ code
#include "stdafx.h"
#include <string>
#include <sstream>
#include <iostream>
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <ctime>
#include <vector>
#include <algorithm>
#include <afxwin.h>

using namespace std;

const int TEST_TIMES = 90000;
const char APPEND_CONTENT[] = "cppmule";
const int PREALLOCATE_SIZE = strlen(APPEND_CONTENT) * TEST_TIMES + 1;

// One timing result: the test name and the elapsed clock() ticks.
// With MSVC, CLOCKS_PER_SEC is 1000, so the ticks are printed as milliseconds.
class Statistic {
public:
    string item;
    int used;

    friend inline ostream & operator << (ostream & os, const Statistic &stat) {
        os << "item: " << stat.item << endl
           << "used: " << stat.used << " ms." << endl << endl;
        return os;
    }
    // const, so that std::sort can compare elements through const references
    inline bool operator>(const Statistic& stat) const {
        return (used > stat.used);
    }
    inline bool operator<(const Statistic& stat) const {
        return (used < stat.used);
    }
};

vector<Statistic> g_statistics;

// Starts the timer for the current test function.
#define BEGIN_TICK() \
    clock_t start = clock();

// Stops the timer, records the result in g_statistics and prints it.
#define END_AND_PRINT_TICK(info) \
    clock_t used = clock() - start; \
    Statistic stat; \
    stat.item.assign(info); \
    stat.used = used; \
    g_statistics.push_back(stat); \
    cout << info << " Used: " << used << " ms." << endl;

// Sorts all recorded results by elapsed time (ascending) and prints them.
#define PRINT_SORT_TEST_TICKS() \
    sort(g_statistics.begin(), g_statistics.end()); \
    struct StatisticPrinter { \
        StatisticPrinter() : order(0) {} \
        void operator() (const Statistic& stat) { \
            ++order; \
            cout << "sort order: " << order << endl \
                 << stat; \
        } \
        int order; \
    } printer; \
    cout << "---------Statistics informations(sorting ascendent)-------" << endl << endl; \
    for_each(g_statistics.begin(), g_statistics.end(), printer); \
    cout << "----------------------" << endl;

void test_stdstring_append()
{
    string str;
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        str.append(APPEND_CONTENT);
    }
    END_AND_PRINT_TICK("std::string append");
    cout << "string length: " << str.length() << endl;
}

void test_stdstring_append_operator()
{
    string str;
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        str += APPEND_CONTENT;
    }
    END_AND_PRINT_TICK("std::string += operator");
    cout << "string length: " << str.length() << endl;
}

void test_stdostringstream()
{
    ostringstream oss;
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        oss << APPEND_CONTENT;
    }
    END_AND_PRINT_TICK("std::ostringstream <<");
    cout << "string length: " << oss.str().length() << endl;
}

// Note: identical to test_stdostringstream. Nothing is preallocated here,
// and main never calls this function.
void test_stdostringstream_preallocate()
{
    ostringstream oss;
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        oss << APPEND_CONTENT;
    }
    END_AND_PRINT_TICK("std::ostringstream <<");
    cout << "string length: " << oss.str().length() << endl;
}

void test_stdstring_append_operator_preallocate()
{
    string str;
    str.reserve(PREALLOCATE_SIZE);
    cout << "capacity: " << str.capacity() << endl
         << "size: " << str.size() << endl;
    //assert(str.capacity() == 1024*1024*512);
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        str += APPEND_CONTENT;
    }
    // The label says "resize", but the call above is reserve(); the text is
    // kept unchanged so that it matches the recorded output.
    END_AND_PRINT_TICK("hava resize(PREALLOCATE_SIZE) std::string += operator");
    cout << "string length: " << str.length() << endl;
}

void test_c_strcat_append()
{
    char* pstr = (char*)malloc(PREALLOCATE_SIZE);
    if (NULL == pstr) {
        cerr << "Can't allocate memory." << endl;
        return;
    }
    // Fixed: the original passed sizeof(pstr), which is only the pointer size.
    memset(pstr, 0, PREALLOCATE_SIZE);
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        strcat(pstr, APPEND_CONTENT);
    }
    END_AND_PRINT_TICK("c string function strcat:");
    cout << "string length: " << strlen(pstr) << endl;
    free(pstr);
    pstr = NULL;
}

void test_c_memcpy_append()
{
    //Allocate memory
    char* pstr = (char*)malloc(PREALLOCATE_SIZE);
    if (NULL == pstr) {
        cerr << "Can't allocate memory." << endl;
        return;
    }
    // Zero the whole buffer so the result is null-terminated after the copies.
    memset(pstr, 0, PREALLOCATE_SIZE);
    BEGIN_TICK();
    int len = 0;
    for (int i=0; i<TEST_TIMES; i++) {
        memcpy(pstr + len, APPEND_CONTENT, strlen(APPEND_CONTENT));
        len += strlen(APPEND_CONTENT);
    }
    END_AND_PRINT_TICK("C language memcpy append");
    cout << "string length: " << strlen(pstr) << endl;
    //Cleanup
    free(pstr);
    pstr = NULL;
}

void test_mfc_cstring_append()
{
    CString str;
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        str.Append(APPEND_CONTENT);
    }
    END_AND_PRINT_TICK("MFC CString append");
    cout << "string length: " << str.GetLength() << endl;
}

void test_mfc_cstring_append_operator()
{
    CString str;
    BEGIN_TICK();
    for (int i=0; i<TEST_TIMES; i++) {
        str += APPEND_CONTENT;
    }
    END_AND_PRINT_TICK("MFC CString operator append");
    cout << "string length: " << str.GetLength() << endl;
}

int _tmain(int argc, _TCHAR* argv[])
{
#ifdef _DEBUG
    cout << "DEBUG version." << endl;
#else
    cout << "Release version." << endl;
#endif
    cout << "TEST_TIME: " << TEST_TIMES << endl;

    test_c_memcpy_append();
    test_stdstring_append_operator();
    test_stdstring_append();
    test_stdostringstream();
    test_stdstring_append_operator_preallocate();
    test_mfc_cstring_append();
    test_mfc_cstring_append_operator();
    test_c_strcat_append();

    PRINT_SORT_TEST_TICKS();
    return 0;
}
C/C++ string concatenation test results:
Release version.
TEST_TIME: 90000
C language memcpy append Used: 0 ms.
string length: 630000
std::string += operator Used: 15 ms.
string length: 630000
std::string append Used: 16 ms.
string length: 630000
std::ostringstream << Used: 16 ms.
string length: 630000
capacity: 630015
size: 0
hava resize(PREALLOCATE_SIZE) std::string += operator Used: 0 ms.
string length: 630000
MFC CString append Used: 63 ms.
string length: 630000
MFC CString operator append Used: 62 ms.
string length: 630000
c string function strcat: Used: 32203 ms.
string length: 630000
---------Statistics informations(sorting ascendent)-------
sort order: 1
item: C language memcpy append
used: 0 ms.
sort order: 2
item: hava resize(PREALLOCATE_SIZE) std::string += operator
used: 0 ms.
sort order: 3
item: std::string += operator
used: 15 ms.
sort order: 4
item: std::string append
used: 16 ms.
sort order: 5
item: std::ostringstream <<
used: 16 ms.
sort order: 6
item: MFC CString operator append
used: 62 ms.
sort order: 7
item: MFC CString append
used: 63 ms.
sort order: 8
item: c string function strcat:
used: 32203 ms.
----------------------
Press any key to continue . . .
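A note on reading these numbers: the test program prints raw clock() ticks and labels them "ms", which holds on MSVC because CLOCKS_PER_SEC is 1000 there. The measured values (0, 15, 16, 62, 63) also sit close to multiples of about 15-16 ms, which suggests a coarse timer tick, so small differences between entries probably should not be over-interpreted. That is an inference from the output, not something the original post states. A minimal sketch of a conversion that does not assume one tick per millisecond might look like this (ticks_to_ms and time_append_ms are hypothetical helpers, not part of the original program):

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Converts a clock() tick count to milliseconds using CLOCKS_PER_SEC,
// instead of assuming that one tick equals one millisecond.
double ticks_to_ms(std::clock_t ticks)
{
    return 1000.0 * static_cast<double>(ticks) / CLOCKS_PER_SEC;
}

// Appends `piece` to `out` `times` times and returns the time reported by
// clock() for the loop, converted to milliseconds.
// (time_append_ms is a hypothetical helper for illustration only.)
double time_append_ms(const std::string& piece, int times, std::string& out)
{
    assert(times >= 0);
    out.clear();
    const std::clock_t start = std::clock();
    for (int i = 0; i < times; ++i) {
        out += piece;
    }
    return ticks_to_ms(std::clock() - start);
}
```

When a run reports 0 ms, raising the iteration count (as the note on TEST_TIMES = 9000000 does for memcpy) is one way to get a measurement above the timer resolution.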
Java test code snippet:
- Java code
void testJavaStringPerformance() {
    final int TEST_TIMES = 90000;

    // String: every += creates a new String object
    String str = "";
    System.out.println("The testing is running, please wait...");
    long start = System.currentTimeMillis();
    for (int i = 0; i < TEST_TIMES; i++) {
        str += "cppmule";
    }
    long strUsed = System.currentTimeMillis() - start;
    System.out.println("strUsed: " + strUsed + " ms.");

    // StringBuffer: synchronized, appends into one growing buffer
    StringBuffer strBuffer = new StringBuffer();
    start = System.currentTimeMillis();
    for (int i = 0; i < TEST_TIMES; i++) {
        strBuffer.append("cppmule");
    }
    long strBufferUsed = System.currentTimeMillis() - start;
    System.out.println("StringBuffer append: " + strBufferUsed + " ms.");

    // StringBuilder: same API as StringBuffer, without synchronization
    StringBuilder strBuilder = new StringBuilder();
    start = System.currentTimeMillis();
    for (int i = 0; i < TEST_TIMES; i++) {
        strBuilder.append("cppmule");
    }
    long strBuilderUsed = System.currentTimeMillis() - start;
    System.out.println("StringBuilder append: " + strBuilderUsed + " ms.");

    System.out.println("Times: " + TEST_TIMES);
}
Java string concatenation results:
The testing is running, please wait...
strUsed: 443141 ms.
StringBuffer append: 15 ms.
StringBuilder append: 31 ms.
Times: 90000
Test document and source code download: http://www.javaeye.com/topic/866216
Author:
By: cppmule
Email: cppmule@gmail.com
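[Solution]
The original poster's conclusion is really that C's strcat and Java's + are both very slow at concatenating strings. It would be even better to go one step further and analyze why they are slow (in fact they are slow only in the case shown here, where pieces are appended again and again to an ever-growing string).
For why strcat is slow, see the article I wrote:
http://blog.csdn.net/yui/archive/2010/05/22/5616455.aspx
Java's + is so much slower than StringBuffer because the time all goes into object creation: every + builds a new String.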
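As a rough illustration of the strcat point (a sketch of the commonly cited reason, not a summary of the linked article): strcat has to walk the destination from its first character to find the terminating '\0' before every append, so the work grows with the length already written. With the 90000 appends of the 7-character "cppmule" used in the test, that adds up to roughly 2.8 × 10^10 characters scanned just to find the end. Keeping a pointer to the current end, which is what the memcpy test effectively does with its len variable, avoids the rescan. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Appends `piece` to `dest` `times` times using strcat. Each call scans
// `dest` from the beginning to find its end, so the total work is quadratic.
// `dest` must have room for strlen(piece) * times + 1 characters.
void append_with_strcat(char* dest, const char* piece, std::size_t times)
{
    assert(dest && piece);
    dest[0] = '\0';
    for (std::size_t i = 0; i < times; ++i) {
        std::strcat(dest, piece);
    }
}

// Produces the same result, but remembers where the string currently ends,
// so each append copies only the new piece and never rescans `dest`.
void append_with_tail(char* dest, const char* piece, std::size_t times)
{
    assert(dest && piece);
    const std::size_t piece_len = std::strlen(piece);
    char* tail = dest;
    for (std::size_t i = 0; i < times; ++i) {
        std::memcpy(tail, piece, piece_len);
        tail += piece_len;
    }
    *tail = '\0';  // terminate once, after the last piece
}
```

Both functions leave the same bytes in the buffer; only the amount of scanning differs, which is consistent with strcat finishing last and memcpy finishing first in the results above.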
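[Solution]
This may look like reviving a dead thread, but I really did come here to say thanks for this post.
After reading it I optimized a number-segment filtering program of mine, and it now runs much faster.
Always using + to concatenate strings really is a bad idea.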