help me improve this merge sort algorithm

AnnieRuru

~~Cute~Cute~Scripter~~
Messages
1,677
Points
0
Location
your next door ~
Discord
AnnieRuru#1609
Github
AnnieRuru
Emulator
Client Version
2019-05-30aRagexeRE
ok I just recently cracked @KeyWorld's merge sort algorithm in this topic
http://www.eathena.ws/board/index.php?s=&showtopic=180080&view=findpost&p=1293102

here is the script
http://rathena.org/board/pastebin/jdsshuef1ic/

I have 2 questions

1.
why is that comb sort runs faster than merge sort ??

in theory, merge sort should be better than comb sort, right ?
O(n log n) is better than O(n2)

however, in practical test on rathena emulator
when .@total = 100,
comb sort run 16~32 mili-seconds
merge sort run 31~47 mili-seconds

something is wrong here

my 1st assumption is my merge sort is not efficient, maybe it can be improve
... however @Euphy told me maybe its just script engine problem ...

can someone test here with hercules emulator and confirm comb sort is faster than merge sort ?

==============================

2.
it seems I'm not able to write this kind of algorithm
I'm trying to replicate the way that I have used in my comb_sort function

dispbottom .@array[ .@r[.@i] ] +"";
so that I can list out the index of the array
I believe display with the index of the sorted array is better than getting output from sorted array
because like I said in this topic
http://rathena.org/board/topic/92070-suggestion-new-scriptcommand-search-array/?p=242896
I want the index of the sorted array to display the highest score, or the lowest number

I wish someone can point me some light on how to get the sorted index done
because comb sort use index swap, and I can easily do it, but merge sort use array merge <.<

 
Last edited by a moderator:
Hi Annie

I'll check later more deeply your script to see if there are rooms for improvements, not much time currently.

But from what I remember (maybe not the case anymore) the script engine do not like recursion (call functions, return to the previous state) and that's what slowing down the script.

 
the script engine do not like recursion (call functions, return to the previous state) and that's what slowing down the script.
YES !now I remember this topic

http://www.eathena.ws/board/index.php?showtopic=237976

we have already discussed before that *goto is not evil ... LOL !

so using multiple callfunc/function/callsub in the script is the one that slow down the script execution time

since these commands parse very ugly

I guess there is no way to improve this script further, since this is a script engine problem

1st question solved

2nd part remains

 
Seems like you can improve the script :

I don't really understand why you create a temporary array, and at least, your (2) problem is related to the fact that you don't bind your output array with your temporary array content.

Seems like you don't use argument 4 of mergesort and this functions can be improve to avoid executing the if condition at each recursion.

And at least in merge function the three last while() can be optimized using our favorite copyarray command.

If it's still slow, some solutions exists to implement the merge sort without relying on recursions
default_tongue.png


 
Here's some optimizations to those functions, as well as an iterative, bottom-up implementation of mergesort.

http://upaste.me/20dc8e

This is how they perform on my Hercules copy, using 1000-element arrays:

Mergesort (recursive):

Code:
[Debug]: script debug : 0 110000001 : time used -> 131 milli-seconds
Mergesort (bottom-up):
Code:
[Debug]: script debug : 0 110000002 : time used -> 89 milli-seconds
Combsort:
Code:
[Debug]: script debug : 0 110000003 : time used -> 98 milli-seconds
For question 2: what about a parallel array of indices, that you swap when you swap the main one?
 
Last edited by a moderator:
You'll have even better performance if you replace the for loop with a while loop and the two condition with a ternary
default_smile.png


I can't test my iterative merge sort, I'm currently stuck with a script engine bug, will have to do some work around soon.

 
I don't really understand why you create a temporary array, and at least, your (2) problem is related to the fact that you don't bind your output array with your temporary array content.
the reason that I want to create a temporary array $@tmp_array is to not mess up with the original input arrayin the original version merge sort you've wrote

the .@array[] will be also sorted along with the .@output[]

when I display the .@array[] after the callfunc "mergesort", this array is sorted !

.

.

And at least in merge function the three last while() can be optimized using our favorite copyarray command.
yeah ... hahaha .. just noticed it too
default_biggrin.png
.

.

If it's still slow, some solutions exists to implement the merge sort without relying on recursions
default_tongue.png
yes its still slow,now I'm using Hercules emulator to test, the array index can go up to 2^32 (4b...)

just like Haru did, he confirm your iterative merge sort still slower than comb sort

.

.

and wait, I just knew merge sort has 2 versions !

http://en.wikipedia.org/wiki/Merge_sort

wait, just how many versions merge sort has ?

nonono .. my head gonna explode 1st hahaha !

.

.

Here's some optimizations to those functions, as well as an iterative, bottom-up implementation of mergesort.
yeah I immediately noticed some modification on the comb sortbut you just bug the script lol

http://upaste.me/c50310475cfb4a976

screenshot -> http://imageshack.com/a/img33/8201/81nz.jpg

obviously you only see the last few index when having over 1000 index

but the 1st few index are not sorted properly

so, I have to use back my own comb sort

honestly, how fast is your computer o.o

same test as yours

.@total = 1000; generate rand(10000);

comb sort -> 468-530 mili-seconds

iterative merge sort -> 484-499 mili-seconds

bottom-up merge sort -> 327-343 mili-seconds (*own*)

.

.

2nd,

keyworld is right

while-loop is faster than for-loop when parse in script engine

the is even a significant difference with this script

same test as before

your original for-loop -> 343-359 mili-seconds

I change into while-loop -> 327-343 mili-seconds

keyworld has proposed an improvements to the looping with rathena script engine

http://rathena.org/board/topic/81457-hardcoded-script-commands-improvement/

any chance this might get implement in hercules ?

.

.

Seems like you don't use argument 4 of mergesort and this functions can be improve to avoid executing the if condition at each recursion.
For question 2: what about a parallel array of indices, that you swap when you swap the main one?
that's because I want to output the sorted array indexnot the sorted array result

Haru also copyarray .@arr from getarg(0) in the comb sort function

which is not I want

.

.

http://rathena.org/board/topic/92070-suggestion-new-scriptcommand-search-array/?p=242896

when search through an index, the script usually already has 2 or more array being called

example like

getinventorylist

@inventorylist_id[] ...

@inventorylist_amount[]

if you want to sort @inventorylist_id array, the @inventorylist_amount array needs to sort as well
here is a real examplehttp://upaste.me/1d2110474e08867d3

screenshot -> http://imageshack.com/a/img33/778/oyn5.jpg

EDIT: -> forgot to tell getitemname2 function come from here

if you try to do dispbottom .@output[.@i], .@output is actually just the index of the array

I want to sort the ID, I also want to sort the parallel array such as @inventorylist_amount and so on

so I only output the sorted index and put them in as @inventorylist_id[ .@output[.@i] ]; and so on along with the parallel arrays

and I have no idea how to do like this on iterative merge sort,

it seems the original array is being sorted as well

but @Haru bottom-up merge sort doesn't seem to be the case

maybe haru can write one xD

======== EDIT ===================

summary ... I think this post too long read

1. throw away the iterative merge sort

2. I want Hercules core developer to take a look at keyworld's script command optimizations and get it implement

3. I like haru's bottom-up merge sort, any way to get that function to output sorted index ?

 
Last edited by a moderator:
You'll have even better performance if you replace the for loop with a while loop and the two condition with a ternary
default_smile.png
Yup, I think so too (decreasing readability but oh well, if a script is performance-critical) 
yeah I immediately noticed some modification on the comb sort

but you just bug the script lol

http://upaste.me/c50310475cfb4a976

screenshot -> http://imageshack.com/a/img33/8201/81nz.jpg

obviously you only see the last few index when having over 1000 index

but the 1st few index are not sorted properly

so, I have to use back my own comb sort
Ah I just noticed an error in the script. In the do/while condition, there's a .@gap > 10 that should be .@gap10 > 10 (I had to introduce that extra gap10 variable because of some checks that were getting messed up because of the division by 1.3, and forgot to update it through the script.
honestly, how fast is your computer o.o

same test as yours

.@total = 1000; generate rand(10000);

comb sort -> 468-530 mili-seconds

iterative merge sort -> 484-499 mili-seconds

bottom-up merge sort -> 327-343 mili-seconds (*own*)
Not so much compared to today's standards, it's a late-2009 iMac with an i7-870 cpu... But it still works reasonably well
default_tongue.png
2nd,

keyworld is right

while-loop is faster than for-loop when parse in script engine

the is even a significant difference with this script

same test as before

your original for-loop -> 343-359 mili-seconds

I change into while-loop -> 327-343 mili-seconds
Yes, this is certainly true. As keyworld said, they compile to different assembly, and the for one calls more jump commands.
keyworld has proposed an improvements to the looping with rathena script engine

http://rathena.org/board/topic/81457-hardcoded-script-commands-improvement/

any chance this might get implement in hercules ?
With some luck, we might be able to get an even faster implementation. Just theoretical right now, but there's something in the works
default_smile.png
Haru also copyarray .@arr from getarg(0) in the comb sort function

which is not I want
That's to avoid affecting the original array, and let the function work on a local copy when it does the swaps.
http://rathena.org/board/topic/92070-suggestion-new-scriptcommand-search-array/?p=242896

when search through an index, the script usually already has 2 or more array being called

example like

getinventorylist

@inventorylist_id[] ...

@inventorylist_amount[]

if you want to sort @inventorylist_id array, the @inventorylist_amount array needs to sort as well
here is a real examplehttp://upaste.me/1d2110474e08867d3

screenshot -> http://imageshack.com/a/img33/778/oyn5.jpg

EDIT: -> forgot to tell getitemname2 function come from here

if you try to do dispbottom .@output[.@i], .@output is actually just the index of the array

I want to sort the ID, I also want to sort the parallel array such as @inventorylist_amount and so on

so I only output the sorted index and put them in as @inventorylist_id[ .@output[.@i] ]; and so on along with the parallel arrays
Hmm, I see. Then how about, instead of using the .@output array as index, you sort @inventorylist_id, and whenever you do one swap on it, you do the very same swap on @inventorylist_amount as well?
Here's two examples (untested, sorry if there's any error). First one sorts in place *both* the original arrays, so you just need to print @inventorylist_id[.@i] and @inventorylist_amount[.@i]. Second one should be what you wanted, it doesn't sort the original arrays at all, and only returns the array of sorted indices, so that you need to print @inventorylist_id[ .@output[.@i] ] and @inventorylist_amount[ .@output[.@i] ]:

http://upaste.me/5c1c8f

(changing the for() to while() is left as an exercise to the reader
default_tongue.png
)

//Edit: did I ever say I hate IPB's broken bbcode parser?

 
Last edited by a moderator:
Ah I just noticed an error in the script. In the do/while condition, there's a .@gap > 10 that should be .@gap10 > 10 (I had to introduce that extra gap10 variable because of some checks that were getting messed up because of the division by 1.3, and forgot to update it through the script.
wow ! that's really fix ithttp://upaste.me/a63a10487f4315bee

my comb sort -> 468-514 miliseconds

your comb sort -> 483-500 miliseconds

although it doesn't seem to improve the execution time

but the range/varies seems to be more stable

... maybe I can learn a thing or two from here XD

.

.

.... Second one should be what you wanted, it doesn't sort the original arrays at all, and only returns the array of sorted indices, so that you need to print @inventorylist_id[ .@output[.@i] ] and @inventorylist_amount[ .@output[.@i] ]:

http://upaste.me/5c1c8f
YES exactly what I wanthttp://upaste.me/78f11048686c8fbc4

everything works perfect

this merge sort will be the one I'll be using from now on

its solved now

you know ... for some reason the answer, merge-sort that I want runs 450 mili-seconds

but your original one on post#5 only runs 350 mili-seconds

however its still 50 mili-seconds faster than comb-sort
default_tongue.png


 
Yep, it's slower because it does more operations, since it needs to swap values from two arrays instead of one~

 
Optimized version of the merge sort index :

Code:
function	script	mergeSortIndex	{	.@size = getarg(2);	copyarray .@arr, getarg(0), .@size;	while (.@i < .@size) {		set .@idx[.@i], .@i;		.@i++;	}	.@width = 1;	while (.@width < .@size) {		.@left = 0;		while (.@left < .@size) {			.@middle = .@size < .@left + .@width     ? .@size : .@left + .@width;			.@right  = .@size < .@left + .@width * 2 ? .@size : .@left + .@width * 2;			.@l      = .@left;			.@m      = .@middle;			while (.@l < .@middle && .@m < .@right) {				if (.@arr[.@l] < .@arr[.@m]) {					.@tmp_arr[.@left] = .@arr[.@l];					.@tmp_idx[.@left] = .@idx[.@l];					.@l++;				} else {					.@tmp_arr[.@left] = .@arr[.@m];					.@tmp_idx[.@left] = .@idx[.@m];					.@m++;				}				.@left++;			}			if (.@middle - .@l) {				copyarray .@tmp_arr[.@left], .@arr[.@l], .@middle - .@l;				copyarray .@tmp_idx[.@left], .@idx[.@l], .@middle - .@l;				.@left += .@middle - .@l;			}			if (.@right - .@m) {				copyarray .@tmp_arr[.@left], .@arr[.@m], .@right - .@m;				copyarray .@tmp_idx[.@left], .@idx[.@m], .@right - .@m;				.@left += .@right - .@m;			}		}		.@width *= 2;		copyarray .@arr, .@tmp_arr, .@size;		copyarray .@idx, .@tmp_idx, .@size;	}	copyarray getarg(1), .@tmp_idx, .@size;	return;}
 
that's cheating !
just because you understand more about *eathena scripting engine works
this function is only optimized with *eathena/hercules scripting engine only LOL !

but yeah

http://upaste.me/27f51050340b8af47

haru's merge sort -> 452-468 mili-seconds

keyworld's merge sort -> 405-422 mili-seconds

actually I'm not that amazed though

I just learn that set getelementofaray( getarg(x) ) is not a good practice <.<

.

.

now I have a quick question ...

is bottom-up merge sort and top-down merge sort has any advantages in certain scenario ?

example like cocktail sort has advantages if the array has already in inverted order ready ...

 
It's not cheating, it's just reducing commands calls
default_smile.png


The best for now is counting sort in Hercules. Can't write it now in my phone but it will execute faaaar faster.
 
Edit: So I checked today in a slow server, with some stupids greats optimizations in all functions. Here my results using 1000 elements :

[Debug]: script debug : 0 110144879 : ==== [SORT] Testing for 1000 elements.[Debug]: script debug : 0 110144879 : [merge_sort] Time used -> 405 ms [PASSED][Debug]: script debug : 0 110144879 : [comb_sort] Time used -> 394 ms [PASSED][Debug]: script debug : 0 110144879 : [counting_sort] Time used -> 57 ms [PASSED] [Debug]: script debug : 0 110144878 : ==== [SORT INDEX] Testing for 1000 elements.[Debug]: script debug : 0 110144878 : [merge_sort_index] Time used -> 608 ms [PASSED][Debug]: script debug : 0 110144878 : [comb_sort_index] Time used -> 483 ms [PASSED]

So as you can see, a hacky Comb Sort is faster than a hacky merge sort in all cases.

If you want to cheat even more, you can use my hackyyyy counting sort (but it does not sort indexes).

Have fun: http://rathena.org/board/pastebin/2kto2abymrp/

 
Last edited by a moderator:
rathena support .@i++

but doesn't support ++.@i

so no, it doesn't support on rathena

.

.

The best for now is counting sort in Hercules.

If you want to cheat even more, you can use my hackyyyy counting sort (but it does not sort indexes).
that's the reason I overlook counting sort as an optionbecause in practical use there is always a parallel array

originally I also thought counting sort cannot output as the array index

but after looking at your sample ... now I have a 2nd opinion ...

your so-called hacking method is not new to me because you have already demonstrated 3 years ago in this shuffle algorithm topic

and I also able to reproduced it until this version

it was made with rathena script engine limitation in mind,

so now I have to update it again to make it compatible with hercules new scripting engine

.

.

in your sample, counting sort doesn't include sorting array index so I think I have to do itI think it might be possible by faking a 2-dimensional array like

Code:
.@tmp[.@_arr[.@i] + .@neg]++;
into
Code:
setd ".@tmp_"+( .@arr[.@i] + .@neg )+"["+ getarraysize( getd( ".@tmp_"+( .@arr[.@i] + .@neg ) ) ) +"]", .@idx[.@i];
and this will still retain the complexity of O(n)
 
Last edited by a moderator:
So rathena sux
default_sad.png


Only implement a buggy version of .@var++ sux a lot...

So let say I forget about rathena for this script now.

I know about getd() and setd(), the only problem with this command is the memory leak it create (specially in this case), so I'm not fan of it anymore
default_sad.png


So here a new version, with the *counting_sort_index*, and also a big bonus : asm functions (for fun).

http://upaste.me/2d0f10561576aed88

Code:
[Debug]: script debug : 0 110198899 : ==== [SORT INDEX] Testing for 1000 elements.[Debug]: script debug : 0 110198899 : [merge_sort_index] Time used -> 610 ms [PASSED][Debug]: script debug : 0 110198899 : [merge_sort_index_asm] Time used -> 578 ms [PASSED][Debug]: script debug : 0 110198899 : [comb_sort_index] Time used -> 508 ms [PASSED][Debug]: script debug : 0 110198899 : [comb_sort_index_asm] Time used -> 476 ms [PASSED][Debug]: script debug : 0 110198899 : [counting_sort_index] Time used -> 94 ms [PASSED][Debug]: script debug : 0 110198899 : [counting_sort_index_asm] Time used -> 91 ms [PASSED][Debug]: script debug : 0 110198900 : ==== [SORT] Testing for 1000 elements.[Debug]: script debug : 0 110198900 : [merge_sort] Time used -> 407 ms [PASSED][Debug]: script debug : 0 110198900 : [merge_sort_asm] Time used -> 378 ms [PASSED][Debug]: script debug : 0 110198900 : [comb_sort] Time used -> 385 ms [PASSED][Debug]: script debug : 0 110198900 : [comb_sort_asm] Time used -> 343 ms [PASSED][Debug]: script debug : 0 110198900 : [counting_sort] Time used -> 57 ms [PASSED][Debug]: script debug : 0 110198900 : [counting_sort_asm] Time used -> 53 ms [PASSED]
 
since keyworld counting sort doesn't really work

-> it display all same index in the output array

I write my own counting sort

use upaste then

http://upaste.me/9aa421992effb80b1

now test how fast is it compare to merge sort, comb sort and counting sort

http://upaste.me/42e2219938d15d0ff

Haru's merge sort -> 499~531 mili-seconds

Keyworld's merge sort -> 453~468 mili-seconds

Haru's comb sort -> 436~453 mili-seconds

Keyworld's comb sort -> 390~437 mili-seconds

counting sort -> 109~125 mili-seconds

I'll stick to my counting sort from now on

Code:
function	script	counting_sort_index	{	.@total = .@size = getarg( 2, getarraysize( getarg(0) ) );	copyarray .@arr, getarg(0), .@size;	while ( .@i < .@size ) {		setd ".@index_"+ .@arr[.@i] +"["+( .@tmp[.@arr[.@i]] )+"]", .@i;		.@tmp[.@arr[.@i]]++;		.@i++;	}	do {		.@index = getarraysize(.@tmp) -1;		.@tmp[.@index]--;		.@out[.@size-1] = getd( ".@index_"+ .@index +"["+( .@tmp[.@index] )+"]" );		.@size--;	} while( .@size );	copyarray getarg(1), .@out, .@total;	return;}
addtionally, asm method doesn't run faster anymore, seems haru already upgrade our script engine
 
Last edited by a moderator:
wow ... after I learn about the prefix increment ... it really do run faster than post-fix increment

http://upaste.me/a70722381339369d1

postfix increment -> 109~125 mili-seconds ( same as above )

prefix increment -> 93~109 mili-seconds

now stick to this type

Code:
//	callfunc "counting_sort_index", <input array>, <sorted array> {, no. of elements };function	script	counting_sort_index	{	.@total = .@size = getarg( 2, getarraysize( getarg(0) ) );	copyarray .@arr, getarg(0), .@size;	while ( .@i < .@size )		setd ".@index_"+ .@arr[.@i] +"["+( .@tmp[.@arr[.@i]]++ )+"]", .@i++;	do {		.@index = getarraysize(.@tmp) -1;		.@out[--.@size] = getd( ".@index_"+ .@index +"["+( --.@tmp[.@index] )+"]" );	} while( .@size );	copyarray getarg(1), .@out, .@total;	return;}
 
Back
Top