# Experiments on Postagger
Detailed experiment results for the Postagger are recorded below:
## Dataset
We use PKU-WEIBO as the training, development, and test data. Details are as follows:
Dataset | Sentences | Word-tag pairs (tokens) |
---|---|---|
Training | 337,422 | 7,720,621 |
Development | 8,000 | 172,054 |
Test | 12,500 | 271,786 |
## Model Structure Info
KEY | VALUE | Notes |
---|---|---|
word dict size | 176,047 | built from training data |
word embedding dimension | 50 | varies across experiments |
BI-LSTM hidden layer dimension | 100 | - |
BI-LSTM stacked layers number | 1 | - |
tag hidden dimension | 32 | - |
tag output dimension (tag number) | 28 | built from training data |
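For reference, here is a minimal PyTorch sketch of a tagger with the dimensions above. It is a hypothetical reconstruction from the table, not the project's actual code; in particular, treating the 100-dim BI-LSTM hidden size as per-direction is an assumption.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Sketch matching the table: 50-dim word embeddings, one BiLSTM layer
    (hidden size 100 per direction, assumed), a 32-dim tag hidden layer,
    and 28 output tags."""

    def __init__(self, vocab_size=176047, emb_dim=50,
                 lstm_hidden=100, tag_hidden=32, n_tags=28):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, lstm_hidden, num_layers=1,
                              bidirectional=True, batch_first=True)
        self.tag_hidden = nn.Linear(2 * lstm_hidden, tag_hidden)
        self.tag_out = nn.Linear(tag_hidden, n_tags)

    def forward(self, word_ids):
        # word_ids: (batch, seq_len) indices into the word dict
        emb = self.embedding(word_ids)        # (batch, seq_len, 50)
        lstm_out, _ = self.bilstm(emb)        # (batch, seq_len, 200)
        hidden = torch.tanh(self.tag_hidden(lstm_out))
        return self.tag_out(hidden)           # (batch, seq_len, 28) tag scores
```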
## Experiments
### Word Embedding with Randomized Initialization
#### Train
Training Epoch | Training Acc | Validation Acc | Speed (seconds / 10k samples) | Memory cost |
---|---|---|---|---|
1 | 94.07% | 94.75% | 120.71* | - |
2 | 96.78% | 95.44% | 119.58 | - |
3 | 97.32% | 95.62% | 117.39 | - |
4 | 97.62% | 95.96% | 115.05 | - |
5 | 97.82% | 96.04% | 116.06 | - |
6 | 97.95% | 96.02% | 113.92 | - |
7 | 98.06% | 96.12% | 114.01 | - |
* Running on Node01
#### Test
ACC = 96.0925 %
Using the model that scored best on the development set, we also evaluated accuracy on the related datasets:
@PKU
develop acc : 97.8118 %
test acc : 97.8933 %
@WEIBO
develop acc : 92.7685 %
test acc : 92.8434 %
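These per-dataset numbers come from one token-accuracy pass per dataset; a minimal sketch of such an evaluation (the `token_accuracy` helper is illustrative, not the project's actual code):

```python
import torch

def token_accuracy(model, dataset):
    """Token-level tagging accuracy: correct word-tag pairs / total tokens."""
    correct = total = 0
    model.eval()
    with torch.no_grad():
        for word_ids, gold_tags in dataset:   # one (sentence, tags) pair per item
            pred = model(word_ids.unsqueeze(0)).argmax(dim=-1).squeeze(0)
            correct += (pred == gold_tags).sum().item()
            total += gold_tags.numel()
    return 100.0 * correct / total
```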
### Word Embedding Loaded from gigawords
#### Train
Hit Rate
gigawords embedding number | model word dict size | hit rate (loaded successfully) |
---|---|---|
335,696(0.34 M) | 176,047(0.18M) | 95,359/176,047(54.17%) |
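Loading with hit-rate counting typically works along these lines. This is a sketch under assumptions, not the project's loader: the embedding file is assumed to hold one word plus 50 floats per line, and un-hit words keep a small uniform random initialization; `load_pretrained` and `word2id` are illustrative names.

```python
import numpy as np

def load_pretrained(path, word2id, emb_dim=50, scale=0.1):
    """Fill an embedding matrix from a pre-trained file; rows for words
    missing from the file ("un-hit") keep random initialization."""
    rng = np.random.RandomState(1234)
    emb = rng.uniform(-scale, scale, (len(word2id), emb_dim)).astype("float32")
    hits = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) != emb_dim + 1:
                continue                      # skip header / malformed lines
            word = parts[0]
            if word in word2id:
                emb[word2id[word]] = np.asarray(parts[1:], dtype="float32")
                hits += 1
    print(f"hit rate: {hits}/{len(word2id)} ({100.0 * hits / len(word2id):.2f}%)")
    return emb
```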
Result
Training Epoch | Training Acc | Validation Acc | Speed (seconds / 10k samples) | Memory cost |
---|---|---|---|---|
1 | 93.97% | 93.60% | 94.01* | - |
2 | 95.65% | 94.45% | 98.13 | - |
3 | 96.23% | 94.67% | 111.09 | - |
4 | 95.56% | 95.04% | 102.25 | - |
5 | 96.80% | 95.20% | 102.04 | - |
6 | 96.95% | 95.16% | 101.56 | - |
7 | 97.07% | 95.28% | 100.76 | - |
* Running on Node05
#### Test
ACC = 95.3254 %
#### Additional Tests
@PKU
devel acc : 97.1792 %
test acc : 97.2476 %
@WEIBO
devel acc : 91.5375 %
test acc : 91.8570 %
#### Analysis
Accuracy is consistently lower than with randomized initialization.
Perhaps the roughly 54% embedding hit rate causes this: combining pre-trained vectors for hit words with randomly initialized vectors for the rest may put the two groups on inconsistent scales, and this conflict could hurt training.
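One way to test this hypothesis would be to compare the scale of loaded versus randomly initialized vectors. A sketch, assuming a boolean `hit_mask` over the embedding rows (the loader sketched above could be extended to return one):

```python
import numpy as np

def compare_scales(emb, hit_mask):
    """If pre-trained and randomly initialized vectors live on very different
    scales, the BiLSTM sees inconsistent inputs for hit vs. un-hit words."""
    hit_norm = np.linalg.norm(emb[hit_mask], axis=1).mean()
    miss_norm = np.linalg.norm(emb[~hit_mask], axis=1).mean()
    print(f"mean L2 norm  hit: {hit_norm:.4f}  un-hit: {miss_norm:.4f}")
```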
### Word Embedding Loaded from sogou-news
#### Train
Hit Rate
sogou-news embedding number | model word dict size | hit rate (loaded successfully) |
---|---|---|
1,354,247(1.35 M) | 176,047(0.18M) | 119,245/176,047(67.73%) |
Result
Training Epoch | Training Acc | Validation Acc | Speed (seconds / 10k samples) | Memory cost |
---|---|---|---|---|
1 | 93.70% | 93.02% | 177.61* | - |
2 | 95.14% | 93.81% | 202.71 | - |
3 | 95.68% | 94.17% | 213.92 | - |
4 | 96.03% | 94.41% | 215.28 | - |
5 | 96.27% | 94.52% | 220.29 | - |
6 | 96.46% | 94.55% | 220.76 | - |
7 | 96.61% | **94.68%** | 215.25 | - |
* Running on Node06
#### Test
ACC = 94.6491 %
#### Additional Tests
@PKU
devel acc : 96.7085 %
test acc : 96.7084 %
@WEIBO
devel acc : 90.6736 %
test acc : 90.9336 %
#### Analysis
Training on Node06 is very slow, and the accuracy is the worst of the three runs.
## Result Summary
word embedding initialization method | hit rate | train acc* | best devel acc (@PKU-WEIBO) | test acc (@PKU-WEIBO) | devel acc (@PKU) | test acc (@PKU) | devel acc (@WEIBO) | test acc (@WEIBO) | speed (s/10k) |
---|---|---|---|---|---|---|---|---|---|
random | - | 98.06% | 96.12% | 96.10% | 97.8118% | 97.8933% | 92.7685% | 92.8434% | 114.01 (CPU@node01) |
gigawords | 54.17% | 97.07% | 95.28% | 95.33% | 97.1792% | 97.2476% | 91.5375% | 91.8570% | 100.76 (CPU@node05) |
sogou-news | 67.73% | 96.61% | 94.68% | 94.65% | 96.7085% | 96.7084% | 90.6736% | 90.9336% | 215.25 (CPU@node06) |
* The `train acc` is taken from the epoch where validation achieved its best result, and so are `test acc` and `speed`.
Results on devel and test at PKU and WEIBO keep 4 digits after the decimal point, because LTP uses this precision.
### Next
- More epochs should be run.
- Finer-grained validation (for example, not only at every epoch, but after every 50,000 training samples; see the sketch after this list).
- Collect the un-hit words and decide how to handle them, to reduce the accuracy drop when loading outer word embeddings.
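A sketch of the finer-grained validation idea from the second item above, assuming the `token_accuracy` helper sketched earlier plus hypothetical `train_step` and `save_checkpoint` helpers:

```python
def train(model, train_data, dev_data, epochs=7, eval_every=50_000):
    """Validate every `eval_every` training samples instead of only once per
    epoch, keeping the checkpoint with the best development accuracy."""
    best_dev, seen = 0.0, 0
    for epoch in range(epochs):
        for batch in train_data:
            train_step(model, batch)        # hypothetical: one optimizer step
            seen += len(batch)
            if seen >= eval_every:
                seen = 0
                dev_acc = token_accuracy(model, dev_data)
                if dev_acc > best_dev:
                    best_dev = dev_acc
                    save_checkpoint(model)  # hypothetical: persist best model
    return best_dev
```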