Kazushige Goto (goto@statabo.rim.or.jp)
Sun, 29 Nov 1998 12:40:47 +0900
From: Jonathan L Dubois <dubois@bec.physics.udel.edu>
Date: Sat, 28 Nov 1998 14:28:39 -0500 (EST)
dubois> The only problem I foresee with doing all the sqrts in r12 at once is that
dubois> I occasionally get some computational savings when (rij <= b). The
dubois> probability of this occurring is usually fairly low but it increases with
dubois> N and is not easily predicted ahead of time. It will also probably
dubois> require some restructuring since q[i] is an array of structures which
dubois> contain information other than the 3vectors of interest here.
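I modified your routine. Below are three versions: a lightly optimized
one, one that uses a fast reciprocal square root (sqrti), and one that
uses a vectorized sqrti.
1. Light optimization
The overhead of r12 is pretty heavy, so the distance is computed inline.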
#include <math.h>   /* sqrt */

int find_jastrow(double *result,int j,struct coord *q,double r[3],double b){
  int i;
  double rij;
  double tmp;
  double tmp1, tmp2, tmp3;
  double r1, r2, r3;

  /* keep the reference position in locals */
  r1 = r[0];
  r2 = r[1];
  r3 = r[2];
  tmp = 1.0;
  if(b>0){
    for(i=0;i < N;i++){
      if(i!=j){
        /* squared distance, computed inline */
        tmp1 = q[i].x[0]-r1;
        tmp1 = tmp1*tmp1;
        tmp2 = q[i].x[1]-r2;
        tmp2 = tmp2*tmp2;
        tmp3 = q[i].x[2]-r3;
        tmp3 = tmp3*tmp3;
        rij = sqrt(tmp1 + tmp2 + tmp3);
        if(rij > b){
          tmp *= (1.0 - b/rij);
        }
        else{
          /* rij <= b: the product is zero, so stop early */
          *result = 0;
          return 0;
        }
      }
    }
  }
  *result = tmp;
  return 1;
}
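2. Using the sqrti routine
tmp *= (1.0 - b/rij);
is pretty slow. You can use a sqrti routine instead of sqrt.
This "sqrti" routine (it calculates 1.0/sqrt() quickly) will be included
in the next libffm version. Because rij now holds 1/r, the division
becomes a multiplication and the test "rij > b" becomes "rij < bi",
where bi = 1/b.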
double sqrti(double);   /* returns 1.0/sqrt(x) */

int find_jastrow(double *result,int j,struct coord *q,double r[3],double b){
  int i;
  double rij;           /* holds 1/r in this version */
  double tmp;
  double tmp1, tmp2, tmp3;
  double r1, r2, r3;
  double bi;

  r1 = r[0];
  r2 = r[1];
  r3 = r[2];
  tmp = 1.0;
  if(b>0){
    bi = 1./b;          /* computed only when b > 0 */
    for(i=0;i < N;i++){
      if(i!=j){
        tmp1 = q[i].x[0]-r1;
        tmp1 = tmp1*tmp1;
        tmp2 = q[i].x[1]-r2;
        tmp2 = tmp2*tmp2;
        tmp3 = q[i].x[2]-r3;
        tmp3 = tmp3*tmp3;
        rij = sqrti(tmp1 + tmp2 + tmp3);
        if(rij < bi){   /* same as r > b */
          tmp *= (1.0 - b * rij);
        }
        else{
          *result = 0;
          return 0;
        }
      }
    }
  }
  *result = tmp;
  return 1;
}
3. Using the vectorized sqrti routine (dsqrtiv)
The sqrti routine is still the bottleneck. We can modify the code to use
a vectorized sqrti routine that handles all N values in one call. But it
will be slower if the probability of "rij < bi" is low, because the early
return no longer saves any square roots. If that probability is high (it
depends on your parameters), this routine can be faster still.
I attached libffm.a, but it is a beta version.
void dsqrtiv(double *, double *, int);

int find_jastrow(double *result,int j,struct coord *q,double r[3],double b){
  int i;
  double tmp;
  double tmp1, tmp2, tmp3;
  double r1, r2, r3;
  double bi;
  double vtemp[N];      /* squared distances, then 1/r after dsqrtiv */

  r1 = r[0];
  r2 = r[1];
  r3 = r[2];
  tmp = 1.0;
  if(b>0){
    /* pass 1: all squared distances (entry j is computed but unused) */
    for(i=0;i < N;i++){
      tmp1 = q[i].x[0]-r1;
      tmp1 = tmp1*tmp1;
      tmp2 = q[i].x[1]-r2;
      tmp2 = tmp2*tmp2;
      tmp3 = q[i].x[2]-r3;
      tmp3 = tmp3*tmp3;
      vtemp[i] = tmp1 + tmp2 + tmp3;
    }
    /* all reciprocal square roots in one call, in place */
    dsqrtiv(vtemp, vtemp, N);
    bi = 1.0/b;
    /* pass 2: accumulate the product, skipping i == j */
    for(i=0;i < N;i++){
      if(i!=j){
        if(vtemp[i] < bi){
          tmp *= (1.0 - b * vtemp[i]);
        }
        else{
          *result = 0;
          return 0;
        }
      }
    }
  }
  *result = tmp;
  return 1;
}
--- goto@statabo.rim.or.jp